en.planet.wikimedia

January 26, 2017

Wikimedia Foundation

Wikimedia Research Newsletter, December 2016

Getting more female editors may not increase the ratio of articles about women

Reviewed by Reem Al-Kashif

A bachelor’s degree thesis by Feli Nicolaes[1] finds that, contrary to the general perception, male and female editors do not tend to edit biographical articles on people of their own gender.

Previous research suggested that one solution to Wikipedia’s lack of biographies of women could be to increase the number of female editors. This was based on the assumption that women would prefer to edit women’s biographies, and men would prefer to edit men’s biographies. Nicolaes refers to this as homophily in her thesis, “Gender bias on Wikipedia: an analysis of the affiliation network”. However, homophily has so far neither been formally investigated nor proved to exist in Wikipedia. Nicolaes analyzes this using her University of Amsterdam research group’s datasets of English Wikipedia editors and the pages they edit. She tracks the editing behavior of both self-identified male and female editors on Wikipedia. Contrary to the mainstream assumption, homophily was not found. In other words, female users’ edits are not focused on female biography pages. In fact, Nicolaes finds “inverted homophily” when considering female users who edit a single biographical article more than 200 times: they are more likely to direct this amount of attention to biography articles about men than male editors are.

This brings to mind an initiative to increase content about women (be it biography articles or other content related to women) that has been live since December 2015 in the Arabic Wikipedia. The initiative takes the form of a contest in which male and female editors try to achieve as much as they can of their self-set goals. Over the four rounds of the contest, only one woman reached the top three in two rounds. So, if the goal is to add more content about women, bringing in more women might not be useful. However, Nicolaes also argues that the study should be replicated on larger datasets to validate the results. It remains to be seen whether the same editor behaviour exists in other language editions. Another limitation of the study is its apparent reliance on the gender information that editors publicly state in their user preferences, a method that is widely used but may be susceptible to biases (discussed in more detail in this review).

Theorizing the foundations of the gender gap

Reviewed by Aaron Shaw

In a forthcoming paper, “‘Anyone can edit’ not everyone does: Wikipedia and the gender gap”[2], Heather Ford and Judy Wajcman use some of the theoretical tools of feminist science and technology studies (STS) to describe underpinnings of the Wikipedia gender gap. The authors argue that three aspects of Wikipedia’s infrastructure define it as a particularly masculine or male-dominated project:

(1) the epistemological foundations of what constitutes valid encyclopedic knowledge,
(2) Wikipedia’s software infrastructure, and
(3) Wikipedia’s policy infrastructure.

The authors argue that each of these arenas represents a space where male activity and masculine norms of truth, scientific fact, legitimacy, and freedom define boundaries of legitimate contribution and action. Accordingly, these boundaries of legitimate contribution and action systematically exclude or devalue perspectives and contributions that could overcome the lack of female participation or perspectives in the Wikipedia projects. The result, according to Ford and Wajcman, is that Wikipedia has created a novel and powerful form of knowledge-production expertise on a foundation that reproduces existing gender hierarchies and inequalities.

How old and new astronomy papers are being cited

Reviewed by Piotr Konieczny

The author analyzes[3] Wikipedia’s citations to academic peer-reviewed articles, finding that “older papers from before 2008 are increasingly less likely to be cited”. The author attempts to use Wikipedia citations as a proxy for public interest in astronomy, though the analysis makes no comparison to other research about public interest in sciences. The article notes that citations to articles from 2008 are the most common, making that year the peak of citations, with fewer and fewer citations for the years since. The analysis is also limited by its cut-off date (1996), chosen “because Scopus indexing of journals changes in this year”. The author concludes that the observed citation pattern is likely “consistent with a moderate tendency towards obsolescence in public interest in research”, as papers become obsolete and newer ones are more likely to be cited; older papers are cited for timeless, uncontroversial facts, and newer ones for newer findings. The author also notes that the late 2000s, i.e. the years around 2008, may be when most of Wikipedia’s content in astronomy was created, though this is not backed up by much besides speculation. Overall, the paper asks an interesting question, but does not provide any surprising insights.

Wikipedia is not a suitable source for election predictions

Reviewed by Piotr Konieczny

The topic of this conference paper, “Election prediction based on Wikipedia pageviews”,[4] is certainly timely. The authors look at which of Wikipedia’s articles related to the US presidential election registered high popularity, and then ask whether elections can be predicted based “on the number of views the spiking pages have and on the correlation between these pages and the presidential nominees or their political program”. They provide an online visualization showing some “Wikipedia topics that have spiked before, during or after [an] election event.”

The authors limit themselves (reasonably) to the English and Spanish Wikipedias. They do a good job of presenting their methods, and outlining problems with gathering data on popularity of articles—something that would be much easier if Wikipedia articles and databases were more friendly when it comes to information about their popularity. Within the limitations described in the paper, the authors conclude that Wikipedia articles about politicians are used mostly after, not before or during debates or other events such as primaries or elections, which suggests that they are not used for fact checking but instead as an information source after the event. “Wikipedia is not, in fact, a reliable polling source”, write the authors, based on (this could be clarified further) the fact that people check Wikipedia after the events, not before them, hence making Wikipedia’s pageviews problematic for prediction.

“Black Lives Matter in Wikipedia: Collaboration and collective memory around online social movements”

Reviewed by Piotr Konieczny

Black Lives Matter die-in protesting alleged police brutality in 2015

In this paper,[5] the researchers look at the relation between the Black Lives Matter (BLM) social movement and its coverage in Wikipedia, asking the following research questions:

  1. “How has Wikipedia editing activity and the coverage of BLM movement events changed over time?”
  2. “How have Wikipedians collaborated across articles about events and the BLM movement?” and
  3. “How are events on Wikipedia re-appraised following new events?”

They aim to contribute to academic discourse on social movements and claim to describe “knowledge production and collective memory in a social computing system as the movement and related events are happening.” They conclude that Wikipedia is a neutral platform, but does indirectly support (or hinder) the movement (or its opponents) by virtue of increased visibility, in the same vein as coverage by the media would. The quality of the movement’s history and documentation on Wikipedia is judged to be of higher value, accessibility, and quality than snapshots on social media platforms like Twitter. Wikipedia also provides space for interested editors to work on articles indirectly related to BLM, further increasing the visibility of related topics, as interested editors move beyond direct BLM articles to other aspects. Examples include historical articles about events preceding BLM that would probably not be written/expanded on in Wikipedia if not for the rise of the BLM movement. The authors conclude that social movement activists can use Wikipedia to document their activities without compromising Wikipedia’s neutrality or other policies: “Without breaking with community norms like NPOV, Wikipedia became a site of collective memory documenting mourning practices as well as tracing how memories were encoded and re-interpreted.” This is a valuable argument that draws interesting connections between Wikipedia and social movements, particularly considering that some (like this reviewer) consider Wikipedia itself to be a social movement.

Briefly

Conferences and events

The third annual Wiki Workshop will take place on April 4 as part of the WWW2017 conference in Perth, Australia. The workshop serves as a platform for Wikimedia researchers to get together on an annual basis and share their research with each other (see also our overview of the papers from the 2016 edition). All Wikimedia researchers are encouraged to submit papers for the workshop and attend it. More details at the call for papers.

See the research events page on Meta-wiki for other upcoming conferences and events, including submission deadlines.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

  • “Facilitating the use of Wikidata in Wikimedia projects with a user-centered design approach”[6] From the abstract: “In its current form, [data from Wikidata] is not used to its full potential [on other Wikimedia projects] for a multitude of reasons, as user acceptance is low and the process of data integration is unintuitive and complicated for users. This thesis aims to develop a concept using user-centered design to facilitate the editing of Wikidata data from Wikipedia. With the involvement of the Wikimedia community, a system is designed which integrates with pre-existing work flows.”
  • “A corpus of Wikipedia discussions: over the years, with topic, power and gender labels”[7] From the abstract: “… we present a large corpus of Wikipedia Talk page discussions that are collected from a broad range of topics, containing discussions that happened over a period of 15 years. The dataset contains 166,322 discussion threads, across 1236 articles/topics that span 15 different topic categories or domains. The dataset also captures whether the post is made by an registered user or not, and whether he/she was an administrator at the time of making the post. It also captures the Wikipedia age of editors in terms of number of months spent as an editor, as well as their gender.”
  • “Wikipedia and the politics of openness” Two reviews of the 2014 book with this title[supp 1], in the journal Information, Communication & Society[8] and in Contemporary Sociology: A Journal of Reviews[9], with the latter summarizing the book as follows: “Tkacz’s text has three main empirical chapters. The first sorts out the ‘politics of openness,’ by which he means how collaboration emerges and forms in an open-ended context. The second empirical contribution is about the possibility that the framing of social interaction might, by itself, be enough to create order and encourage productivity in an environment like Wikipedia. … The third empirical contribution is that project exit has an extremely important role in maintaining the stability of Wikipedia. As people develop projects, they create parallel, break-off versions of a project [forks].”
  • “Derivation of ‘is a’ taxonomy from Wikipedia category graph”[10]
  • “‘En Wikipedia no se escribe jugando’: Identidad y motivación en 10 wikipedistas regiomontanos.”[11] From the English abstract: “This study qualitatively analyses the contributions in the talk pages of the Spanish Wikipedia by the ten most-active registered users in Monterrey, Mexico. Using virtual ethnography … this research finds that these self-styled ‘wikipedistas’ assume the site’s collective identity when interacting with anonymous users, and that their main motivations for ongoing participation are not related to the repository of knowledge in itself, but to their group dynamics and inter-personal relationships within the community.”
  • “Schreiben in der Wikipedia” (“Writing in Wikipedia”)[12] From the book (translated): “From the perspective of Wikipedia research, it can be observed that Wikipedia must not be regarded as a community medium [‘gemeinschaftliches Medium’] per se, but that it reflects a conglomerate of individual and community writing processes, which in turn both influence the text genesis, with differing scopes. This chronological development is laid open here for the first time in case of some exemplary article texts, and subsequently, specific properties of each article topic are related to creation of the article that is based on it.”
  • “Beyond the Book: linking books to Wikipedia”[13] From the abstract: “The book translation market is a topic of interest in literary studies, but the reasons why a book is selected for translation are not well understood. The Beyond the Book project investigates whether web resources like Wikipedia can be used to establish the level of cultural bias. This work describes the eScience tools used to estimate the cultural appeal of a book: semantic linking is used to identify key words in the text of the book, and afterwards the revision information from corresponding Wikipedia articles is examined to identify countries that generated a more than average amount of contributions to those articles. … We assume a lack of contributions from a country may indicate a gap in the knowledge of readers from that country. We assume that a book dealing with that concept could be more exotic and therefore more appealing for certain readers … An indication of the ‘level of exoticness’ thus could help a reader/publisher to decide to read/translate the book or not. Experimental results are presented for four selected books from a set of 564 books written in Dutch or translated into Dutch, assessing their potential appeal for a Canadian audience.”
  • “A multilingual approach to discover cross-language links in Wikipedia”[14] From the abstract: “… given a Wikipedia article (the source) EurekaCL uses the multilingual and semantic features of BabelNet 2.0 in order to efficiently identify a set of candidate articles in a target language that are likely to cover the same topic as the source. The Wikipedia graph structure is then exploited both to prune and to rank the candidates. Our evaluation carried out on 42,000 pairs of articles in eight language versions of Wikipedia shows that our candidate selection and pruning procedures allow an effective selection of candidates which significantly helps the determination of the correct article in the target language version.”
  • “Analyzing organizational routines in online knowledge collaborations: a case for sequence analysis in CSCW”[15] From the abstract: “Research into socio-technical systems like Wikipedia has overlooked important structural patterns in the coordination of distributed work. This paper argues for a conceptual reorientation towards sequences as a fundamental unit of analysis for understanding work routines in online knowledge collaboration. Using a data set of 37,515 revisions from 16,616 unique editors to 96 Wikipedia articles as a case study, we analyze the prevalence and significance of different sequences of editing patterns.” See also slides and a separate review by Aaron Halfaker (“This is a weird paper. It isn’t actually a study. It’s more like a methods position paper.”)
  • “Wikipedia: medium and model of collaborative public diplomacy”[16] From the abstract: “Taking a case-study approach, the article posits that Wikipedia holds a dual relevance for public diplomacy 2.0: first as a medium; and second, as a model for public diplomacy’s evolving process. Exploring Wikipedia’s folksonomy, crowd-sourced through intense and organic collaboration, provides insights into the potential of collective agency and symbolic advocacy.”
  • “Enabling fine-grained RDF data completeness assessment”[17] From the abstract: “The idea of the paper is to have completeness information over RDF data sources and use it for checking query completeness. In particular, [for Wikidata,] an indexing technique was developed to allow to scale completeness reasoning to Wikidata-scale data sources. The applicability of the framework was verified using Wikidata and COOL-WD, a completeness tool for Wikidata, was developed. The tool is available at http://cool-wd.inf.unibz.it/”
  • “Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO”[18] From the abstract: “In recent years, several noteworthy large, cross-domain and openly available knowledge graphs (KGs) have been created. These include DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Although extensively in use, these KGs have not been subject to an in-depth comparison so far. In this survey, we provide data quality criteria according to which KGs can be analyzed and analyze and compare the above mentioned KGs.” From the paper: “… Wikidata covers all relations of the gold standard, even though it contains considerably less relations [than Freebase] (1,874 vs. 70,802). The Wikidata methodology to let users propose new relations, to discuss about their coverage and reach, and finally to approve or disapprove the relations, seems to be appropriate.”

    Mus musculus had all its genes imported into Wikidata

  • “Wikidata as a semantic framework for the Gene Wiki initiative”[19] From the abstract: “… we imported all human and mouse genes, and all human and mouse proteins into Wikidata. In total, 59 721 human genes and 73 355 mouse genes have been imported from NCBI and 27 306 human proteins and 16 728 mouse proteins have been imported from the Swissprot subset of UniProt. … The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified. … Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists.”
  • “Connecting every bit of knowledge: The structure of Wikipedia’s First Link Network”[20] From the abstract: “By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia’s First Link Network. … By traversing every path, we measure the accumulation of first links, path lengths, groups of path-connected articles, and cycles. … we find scale-free distributions describe path length, accumulation, and influence. Far from dispersed, first links disproportionately accumulate at a few articles—flowing from specific to general and culminating around fundamental notions such as Community, State, and Science. Philosophy directs more paths than any other article by two orders of magnitude. We also observe a gravitation towards topical articles such as Health Care and Fossil Fuel.” (See also media coverage: “All Wikipedia Roads Lead to Philosophy, but Some of Them Go Through Southeast Europe First” and Wikipedia:Getting to Philosophy)

References

  1. Nicolaes, Feli (2016-06-24). “Gender Bias on Wikipedia: An analysis of the affiliation network” (PDF). Faculty of Science, Science Park 904, 1098 XH Amsterdam: University of Amsterdam.
  2. Ford, Heather; Wajcman, Judy. “‘Anyone can edit’ not everyone does: Wikipedia and the gender gap” (PDF). Social Studies of Science. ISSN 0306-3127.
  3. Thelwall, Mike (2016-11-14). “Does astronomy research become too dated for the public? Wikipedia citations to astronomy and astrophysics journal articles 1996–2014”. El Profesional de la Información 25 (6): 893–900. doi:10.3145/epi.2016.nov.06. ISSN 1699-2407.
  4. Ciocirdel, Georgiana Diana; Varga, Mihai (2016). Election prediction based on Wikipedia pageviews (PDF). p. 9. 
  5. Twyman, Marlon; Keegan, Brian C.; Shaw, Aaron (2016-11-03). “Black Lives Matter in Wikipedia: Collaboration and collective memory around online social movements”. arXiv:1611.01257 [physics]. doi:10.1145/2998181.2998232. 
  6. Kritschmar, Charlie (2016-03-03). Facilitating the use of Wikidata in Wikimedia projects with a user-centered design approach (PDF) (Thesis).  Bachelor’s thesis written at the HTW Berlin in Internationale Medieninformatik
  7. Prabhakaran, Vinodkumar; Rambow, Owen (2016). “A corpus of Wikipedia discussions: over the years, with topic, power and gender labels”. p. 5. 
  8. Gotkin, Kevin (2016-02-24). “Wikipedia and the politics of openness”. Information, Communication & Society 0 (0): 1–3. doi:10.1080/1369118X.2016.1151911. ISSN 1369-118X.  Closed access
  9. Rojas, Fabio (2016-03-01). “Wikipedia and the Politics of Openness”. Contemporary Sociology: A Journal of Reviews 45 (2): 251–252. doi:10.1177/0094306116629410lll. ISSN 0094-3061. 
  10. Ben Aouicha, Mohamed; Hadj Taieb, Mohamed Ali; Ezzeddine, Malek (2016-04-01). “Derivation of “is a” taxonomy from Wikipedia category graph”. Engineering Applications of Artificial Intelligence 50: 265–286. doi:10.1016/j.engappai.2016.01.033. ISSN 0952-1976.  Closed access
  11. Corona Reyes, Sergio; Reyes, Sergio Antonio Corona; Yáñez, Brenda Azucena Muñoz (2015-12-29). ““En Wikipedia no se escribe jugando”: Identidad y motivación en 10 wikipedistas regiomontanos.”. Global Media Journal México 12 (23). 
  12. Kallass, Dr Kerstin (2015). Schreiben in der Wikipedia. Springer Fachmedien Wiesbaden. doi:10.1007/978-3-658-08265-9. ISBN 978-3-658-08265-9.  Closed access (in German)
  13. Martinez-Ortiz, C.; Koolen, M.; Buschenhenke, F.; Dalen-Oskam, K. v (2015-08-01). “Beyond the Book: linking books to Wikipedia”. 2015 IEEE 11th International Conference on e-Science (e-Science). pp. 12–21. doi:10.1109/eScience.2015.12.  Closed access
  14. Bennacer, Nacéra; Vioulès, Mia Johnson; López, Maximiliano Ariel; Quercini, Gianluca (2015-11-01). “A multilingual approach to discover cross-language links in Wikipedia”. In Jianyong Wang, Wojciech Cellary, Dingding Wang, Hua Wang, Shu-Ching Chen, Tao Li, Yanchun Zhang (eds.). Web Information Systems Engineering – WISE 2015. Lecture Notes in Computer Science. Springer International Publishing. pp. 539–553. ISBN 9783319261898.  Closed access
  15. Keegan, Brian C.; Lev, Shakked; Arazy, Ofer (2015-08-19). “Analyzing organizational routines in online knowledge collaborations: a case for sequence analysis in CSCW”. arXiv:1508.04819 [physics, stat]. 
  16. Byrne, Caitlin; Johnston, Jane (2015-10-23). “Wikipedia: medium and model of collaborative public diplomacy”. The Hague Journal of Diplomacy 10 (4): 396–419. doi:10.1163/1871191X-12341312. ISSN 1871-191X.  Closed access
  17. Darari, Fariz; Razniewski, Simon; Prasojo, Radityo Eko; Nutt, Werner (2016). “Enabling fine-grained RDF data completeness assessment”. Proceedings of the 16th International Conference on Web Engineering (ICWE ’16). Lugano, Switzerland. 2016. Springer International Publishing. doi:10.1007/978-3-319-38791-8_10.  Closed access (preprint freely available online)
  18. Färber, Michael; Ell, Basil; Menne, Carsten; Rettinger, Achim; Bartscherer, Frederic (2016). Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. 
  19. Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Mitraka, Elvira; Turner, Julia; Putman, Tim; Leong, Justin; Naik, Chinmay; Pavlidis, Paul; Schriml, Lynn; Good, Benjamin M.; Su, Andrew I. (2016-01-01). “Wikidata as a semantic framework for the Gene Wiki initiative”. Database 2016: 015. doi:10.1093/database/baw015. ISSN 1758-0463. PMID 26989148. 
  20. Ibrahim, Mark; Danforth, Christopher M.; Dodds, Peter Sheridan (2016-05-01). “Connecting every bit of knowledge: The structure of Wikipedia’s First Link Network”. arXiv:1605.00309 [cs]. 
Supplementary references:
  1. Tkacz, Nathaniel (2014-12-19). Wikipedia and the politics of openness. Chicago ; London: University Of Chicago Press. ISBN 9780226192277. 

Wikimedia Research Newsletter
Vol: 6 • Issue: 12 • December 2016
This newsletter is brought to you by the Wikimedia Research Committee and The Signpost


by Tilman Bayer at January 26, 2017 06:12 AM

January 25, 2017

Wikimedia UK

The first week’s highlights from #1lib1ref

We are just over a week into the second annual #1lib1ref campaign, where we “imagine a world where every librarian adds one more reference to Wikipedia.”

Jerwood Library, Trinity Hall, Cambridge. Photo by Andrew Dunn, CC BY-SA 2.0.

Wikipedia is based on real facts, backed up by citations—and librarians are expert at finding supporting research.

This year’s campaign launched on January 15, to celebrate Wikipedia’s sixteenth birthday.  As of Monday, participants have made over 1,543 contributions on 1,065 articles in 15 different languages.

We know that more librarian meetups, events, editathons, webinars, coffee hours, tweets, photos, sticker-selfies, blog posts and more have happened—share them on social media to help spread the campaign! Here are a few highlights from the week.

IFLA white papers

After a year-long conversation, the International Federation of Library Associations (IFLA) kicked off #1lib1ref by officially publishing two “Opportunities Papers” emphasizing the potential for collaboration between Wikipedia and academic and public libraries.

Showing the story of a citation

#1lib1ref provides a great opportunity for communities to create resources about how to contribute to Wikimedia projects. Below are great new ones made for the campaign:

Video via Wikimedia Germany and the Simpleshow Foundation, CC BY-SA 4.0.
  1. Wikimedia Deutschland made a great video explainer in both English and German.
  2. NCompass Live hosted a webinar: The Wikimedia Foundation’s Alex Stinson alongside Wiki-Librarians Jessamyn West, Phoebe Ayers, Merrilee Profitt and Kelly Doyle provided an overview of the ways different library communities can improve Wikipedia.
  3. Wikipedian in Residence at the University of Edinburgh, Ewan McAndrew, developed excellent introductory videos for how to contribute to #1lib1ref!

A global story grows bigger

The campaign is already bigger than last year’s: we have surpassed last year’s contributions and we’re not even finished yet. To convey the scope and excitement, we created a Storify collecting some of the most interesting of last week’s tweets, which numbered over 1,000.

We still have two more weeks to go! Keep pushing to get your local librarians and libraries involved with the campaign, and help share the gift of a citation with the world.

Alex Stinson, GLAM Strategist
Jake Orlowitz, Head of the Wikipedia Library
Wikimedia Foundation

by Alex Stinson at January 25, 2017 04:01 PM

Semantic MediaWiki

Help:Embedded format

Embedded format
Embed selected articles.
Available languages: de, en, zh-hans
Provided by: Semantic MediaWiki
Added: 0.7
Removed: still supported
Requirements: none
Format name: embedded
Enabled by default: yes
Authors: Markus Krötzsch
Categories: misc

The result format embedded is used to embed the contents of the pages in a query result into a page. The embedding uses MediaWiki transclusion (like when inserting a template), so the tags <includeonly> and <noinclude> work for controlling what is displayed.
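
As a rough illustration of how these tags affect embedding, a page picked up by an embedded query could be written as in the sketch below. The page text is hypothetical, semantic annotations are omitted for brevity, and the sketch relies only on standard MediaWiki transclusion behaviour: content inside <noinclude> appears on the page itself but not where it is embedded, while content inside <includeonly> appears only where the page is embedded.

```wikitext
This sentence is shown both on the page and wherever the page is embedded.
<noinclude>
This paragraph is shown only when the page is viewed directly.
</noinclude>
<includeonly>
''This line is shown only where the page is embedded.''
</includeonly>
```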

Parameters

General

Parameter    Type           Default              Description
source       text           empty                Alternative query source
limit        whole number   50                   The maximum number of results to return
offset       whole number   0                    The offset of the first result
link         text           all                  Show values as links
sort         list of texts  empty                Property to sort the query by
order        list of texts  empty                Order of the query sort
headers      text           show                 Display the headers/property names
mainlabel    text           no                   The label to give to the main page name
intro        text           empty                The text to display before the query results, if there are any
outro        text           empty                The text to display after the query results, if there are any
searchlabel  text           ... further results  Text for continuing the search
default      text           empty                The text to display if there are no query results

Format specific

Parameter    Type    Default  Description
embedformat  text    h1       The HTML tag used to define headings
embedonly    yes/no  no       Display no headings

The embedded format introduces the following additional parameters:

  • embedformat: defines which kind of headings to use when pages are embedded. It may be a heading level (one of h1, h2, h3, h4, h5, h6) or a list format (one of ul and ol).
  • embedonly: if this parameter has any value (e.g. yes), then no headings are used for the embedded pages at all.

Example

The following creates a list of recent news posted on this site (like in a blog):

{{#ask:
 [[News date::+]]
 [[language code::en]]
 |sort=news date
 |order=descending
 |format=embedded
 |embedformat=h3
 |searchlabel= <br />[view older news]
 |limit=3
}}

This produces the following output:

Semantic MediaWiki 2.4.5 released

English

Semantic MediaWiki 2.4.5 (SMW 2.4.5) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

Semantic MediaWiki 2.4.4 released

English

Semantic MediaWiki 2.4.4 (SMW 2.4.4) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for MySQL 5.7 issues of the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

Semantic MediaWiki 2.4.3 released

English

Semantic MediaWiki 2.4.3 (SMW 2.4.3) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.
[view older news]

Note: The newline (<br />) is used to put the further results link on a separate line.
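
The two format-specific parameters can be tried out with variations on the query above. The following is a minimal sketch, not taken from the original help page: it reuses the "News date" property from this site's example, which is an assumption for any other wiki. The first query should present the embedded pages as a bulleted list rather than under headings, and the second should embed the page contents with no headings at all.

```wikitext
{{#ask:
 [[News date::+]]
 |format=embedded
 |embedformat=ul
 |limit=5
}}

{{#ask:
 [[News date::+]]
 |format=embedded
 |embedonly=yes
 |limit=5
}}
```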

Remarks

Note that embedding pages may accidentally include category statements if the embedded articles have any categories. Use <noinclude> to prevent this, e.g. by writing

<noinclude>[[Category:News feed]]</noinclude>

SMW will take care that embedded articles do not import their semantic annotations, so these need not be treated specifically.

Also note that printout statements have no effect on embedding queries.

Limitations

You cannot use the embedded format to embed a query from another page if that query relies on the magic word {{PAGENAME}}.



This documentation page applies to all SMW versions from 0.7 to the most current version.


by Wladek92 at January 25, 2017 11:18 AM

Gerard Meijssen

#Wikidata - Sultanism anyone?

The definition of "sultanism" is:
In political science, sultanism is a form of authoritarian government characterized by the extreme personal presence of the ruler in all elements of governance. The ruler may or may not be present in economic or social life, and thus there may be pluralism in these areas, but this is never true of political power.
There are prominent scientists who use the term. It therefore must be applicable, and indeed there are some who consider that any sultanate is defined by it. The problem is that the name is very much linked to Islam, yet it applies equally to monarchs like Henry VIII. King Henry started the Church of England, and the way the Church of England came to be makes sultanism applicable.

It does not really matter how the concept of sultanism came to be. The name chosen is extremely prejudicial. The problem we face is that words and facts matter. Both Wikipedia and Wikidata represent a neutral point of view and therefore a concept like sultanism deserves a place. However, when such a concept is to be applied, it needs to be applied in a neutral way. It means that you cannot point to a country and say "sultanate". It means that it applies to a ruler and it therefore applies to Henry as much as to an evil genius like Jafar.
Thanks,
     GerardM

by Gerard Meijssen (noreply@blogger.com) at January 25, 2017 07:09 AM

January 24, 2017

Wikimedia Foundation

The first week’s highlights from #1lib1ref

Jerwood Library, Trinity Hall, Cambridge. Photo by Andrew Dunn, CC BY-SA 2.0.

We are just over a week into the second annual #1lib1ref campaign, where we “imagine a world where every librarian adds one more reference to Wikipedia.”

Wikipedia is based on real facts, backed up by citations—and librarians are expert at finding supporting research.

This year’s campaign launched on January 15, to celebrate Wikipedia’s sixteenth birthday.  As of Monday, participants have made over 1,543 contributions on 1,065 articles in 15 different languages.

We know that more librarian meetups, events, editathons, webinars, coffee hours, tweets, photos, sticker-selfies, blog posts and more have happened—share them on social media to help spread the campaign! Here are a few highlights from the week.

IFLA white papers

Following a year-long conversation with the International Federation of Library Associations, they kicked off #1lib1ref by officially publishing two “Opportunities Papers” emphasizing the potential for collaboration between Wikipedia and academic and public libraries.

Showing the story of a citation

#1lib1ref provides a great opportunity for communities to create resources about how to contribute to Wikimedia projects. Below are great new ones made for the campaign:

File:Explainer Video - Using good sources on wikipedia.webm

Video via Wikimedia Germany and the Simpleshow Foundation, CC BY-SA 4.0.

 

  1. Wikimedia Deutschland made a great video explainer in both English and German.
  2. NCompass Live hosted a webinar: The Wikimedia Foundation’s Alex Stinson alongside Wiki-Librarians Jessamyn West, Phoebe Ayers, Merrilee Profitt and Kelly Doyle provided an overview of the ways different library communities can improve Wikipedia.
  3. Wikipedian in Residence at the University of Edinburgh, Ewan McAndrew, developed excellent introductory videos for how to contribute to #1lib1ref!

A global story grows bigger

The campaign is already bigger than last year: we have surpassed last year's contributions and we're not even finished yet. To convey the scope and excitement, we created a Storify to share some of the most interesting of last week's tweets, which numbered over 1,000.

We still have two more weeks to go! Keep pushing to get your local librarians and libraries involved with the campaign, and help share the gift of a citation with the world.

Alex Stinson, GLAM Strategist
Jake Orlowitz, Head of the Wikipedia Library
Wikimedia Foundation

Image by Spiritia, public domain/CC0.

by Alex Stinson and Jake Orlowitz at January 24, 2017 08:43 PM

William Beutler

#1Lib1Ref and Adventures in Practical Encyclopedia-Building

The Wikipedian has long been of the opinion, perhaps controversial on Wikipedia, that it is a mistake to think that it can recruit the entire world to become Wikipedia editors. Yet this is the premise upon which so many aspects of Wikipedia’s platform are based.

Start with the fact that anyone can edit (almost) any page at any time. This was Wikipedia’s brilliant original insight, and there is no doubt it made Wikipedia what it is today. But along with scholars and other knowledge-loving contributors comes the riff raff. The calculation is that the value of good editors attracted by Wikipedia’s open-editing policy will outweigh the vandals and troublemakers. On one hand, it is an article of faith not rigorously tested. On the other hand, Wikipedia’s mere existence is proof that the bet is generally sound.

All of which is preamble to praise Wikipedia’s #1Lib1Ref project, now in its second year, for taking what is to my mind a more sensible approach to building Wikipedia’s editorship: targeting persons and professions that already have more in common with Wikipedia than they might realize, in this case librarians. Whereas the official Wikimedia vision statement calls for “a world in which every single human being can freely share in the sum of all knowledge”, the #1Lib1Ref tagline suggests “a world where every librarian added one more reference to Wikipedia.”

This is great! As much as The Wikipedian strongly supports the big-picture goal of the vision statement, the fact is asking “every” person to contribute “all” things is no place to begin. But asking a very specific type of person to make just one contribution actually turns out to be massively more powerful because it is vastly more effective.

Speaking anecdotally, the greatest hurdle to becoming a Wikipedia contributor is figuring out how to make that very first edit.[1] Encouraging the determination to give it a try, and creating a simple set of steps to help them get there, will do a lot more than the sum of all lofty rhetoric.

#1Lib1Ref runs January 15 to February 3, and you can learn more about it via The Wikipedia Library. If you decide to get involved, you should also consider posting with the obvious hashtag on Twitter or another social platform of your choice. Oh, and if you don’t get to it before February 3, I’m sure they’ll be happy to have you join in after the fact.

P.S. You have no idea how hard it was to write this without making either a Bob Marley or U2 reference. If you now have one song or the other stuck in your head, you are most welcome.

The Wikipedia Library logo by User:Heatherawalls, licensed under Creative Commons.

Notes

1. The second greatest hurdle is getting that person to figure out what to do next, but that is for another day.

by William Beutler at January 24, 2017 04:09 PM

January 23, 2017

Andy Mabbett (pigsonthewing)

Bromptons in Museums and Art Galleries

Every time I visit London, with my Brompton bicycle of course, I try to find time to take in a museum or art gallery. Some are very accommodating and will cheerfully look after a folded Brompton in a cloakroom (e.g. Tate Modern, Science Museum) or, more informally, in an office or behind the security desk (Bank of England Museum, Petrie Museum, Geffrye Museum; thanks folks).


Brompton bicycle folded

When folded, Brompton bikes take up very little space

Others, without a cloakroom, have lockers for bags and coats, but these are too small for a Brompton (e.g. Imperial War Museum, Museum of London) or they simply refuse to accept one (V&A, British Museum).

A Brompton bike is not something you want to chain up in the street, and carrying a hefty bike-lock would defeat the purpose of the bike’s portability.


Jack Wills, New Street (geograph 4944811)

This Brompton bike hire unit, in Birmingham, can store ten folded bikes each side. The design could be repurposed for use at venues like museums or galleries.

I have an idea. Brompton could work with museums — in London, where Brompton bikes are ubiquitous, and elsewhere, though my Brompton and I have never been turned away from a museum outside London — to install lockers which can take a folded Brompton. These could be inside with the bag lockers (preferred) or outside, using the same units as their bike hire scheme (pictured above).

Where has your Brompton had a good, or bad, reception?

Update

Less than two hours after I posted this, Will Butler-Adams, MD of Brompton, replied to me on Twitter:

so now I’m reaching out to museums, in London to start with, to see who’s interested.

by Andy Mabbett at January 23, 2017 08:24 PM

Wikimedia Foundation

“I knew that once I started, I wouldn’t be able to stop writing”: Başak Tosun

Photo by Muzammil, CC BY-SA 4.0.

Başak Tosun has been editing the Turkish Wikipedia for over a decade, and she still remembers the feeling she got when she received an email inviting her to contribute.

“The moment I read about [Wikipedia], the idea of writing encyclopedic articles sounded like fun,” Tosun recalls. “But I know myself very well. I hesitated about visiting that website. I knew that once I started, I wouldn’t be able to stop writing.”

Tosun successfully held out for a few months but eventually decided to take the plunge. Editing Wikipedia started as a simple hobby, writing articles about her favorite anime characters, before it became more when she decided to channel her efforts into filling content gaps on the Turkish Wikipedia.

“Many artists, writers and scientists were missing on Wikipedia, or not being fairly or adequately represented on the internet,” says Tosun. “I felt empowered knowing I could do something about it.”

Massacre of the Mamluks in Cairo Citadel, 1805. Photo by Horace Vernet, public domain.

One of the 1,100 articles created by Tosun was on Ibn Taghribirdi, a historian from fifteenth century Egypt. He lived with Cairo’s Mamluk elite (the Turkish ruling class of slave origins). Ibn Taghribirdi is known for his analytic style in documenting the Mamluk rulers of Egypt and the history of Egypt during the middle ages.

Overall, Tosun has a passion for editing history and biographies and has invested much time in developing articles about the history of Turkey, women in art, musician and dancer profiles, and more. This interest has bled over into her professional life as well: “When I started editing history, I recognized that my general knowledge of it was lacking,” she said. “So I started a four-year degree program in history at an open university that I’m completing this year.”

Tosun doesn’t have a utopian view of the Wikipedia community. Mistakes occur, she notes, but assuming good faith makes them tolerable. Even editing conflicts can result in collaboration on developing a topic. “Sometimes there are conflicts on the ethnic origins of people in the biographies I write,” Tosun explains. “Most of the time, the person in question is of mixed origins, and therefore all sides of the conflict have merit. In such cases, I usually try to add as much detailed information and references as I can to support all views.”

Tosun enjoys sharing her experience with others and showing them how to contribute. According to her, “It’s always easy to start contributing to Wikipedia as long as the new user recognizes the edit button.” Based on that, she suggested that her sister, a psychology professor, assign her students editing tasks on Wikipedia as part of their syllabus. Tosun offered to help train the students on how to edit Wikipedia.

The plan worked and the next semester, another professor joined the efforts with 102 students. “Most students do not continue contributing extensively, but at least they become better readers of Wikipedia,” Tosun explains. Together with the Turkish Wikipedia community, Tosun is now helping with several Wikipedia courses in different universities.

When not on Wikipedia, Tosun works for a web hosting and domain registration company. She studied political science before going for a second degree in history.

One of her wishes for 2017 is to help organize an editathon (editing workshop) at the Poetry Library in the city where she lives, where the participants would focus on editing Turkish poet profiles.

Interview by Syed Muzammiluddin, Wikimedia Community Volunteer
Profile by Samir Elsharbaty, Digital Content Intern, Wikimedia Foundation

by Syed Muzammiluddin and Samir Elsharbaty at January 23, 2017 08:15 PM

Wiki Education Foundation

Students share linguistics with the world

While visiting the Linguistic Society of America Conference in Austin earlier this month, I asked attendees: why do you think the study of linguistics is so relevant today? Their replies were varied: the election, the rise of fake news, the importance of understanding language bias, and knowing how we use rhetoric to persuade others.

In 2016, linguistics was a topic of interest not just in academic scholarship, but also in popular culture and politics. When Arrival hit theaters in the fall, it challenged us to think about the power of language in shaping our understanding of the world — or other worlds. Throughout the year, news outlets asked us to consider the relevance of the President-elect’s rhetorical devices and speech patterns in shaping public opinion.

Here at Wiki Ed, we agree that the public’s understanding of these issues is paramount. That’s why in November 2015, just as we were starting to promote our Year of Science, we announced our partnership with the Linguistic Society of America to support students as they work to systematically improve coverage of linguistics topics on Wikipedia. And in the last year we’ve done just that.

Since the beginning of our partnership in the spring 2016 term, we’ve supported 25 courses with 348 students as they contributed to language and linguistics articles on Wikipedia. Together, the 373 articles they improved, including 13 new entries, have been viewed over 13.5 million times. These numbers further illustrate the relevance of linguistics to public conversations in 2016. That’s partly why I found myself so glad to be returning to LSA as we kick off the new year, and why Wiki Ed is so proud of our partnership. In 2017, we’d love to continue to grow our support of these classes.

Wiki Ed provides technical tools, training materials, and flexible assignment timelines to make integrating Wikipedia into your courses as simple as possible. Instructors and students also receive staff support throughout the semester. One instructor teaching with us for the second time this spring came by my booth at the conference and said “Wiki Ed’s support will save me 50 hours in prep time!” I hope you’ll join us in using Wikipedia to help the world understand the crucial work of linguists.

For more information about teaching with Wikipedia generally, visit teach.wikiedu.org. If you’d like to talk with someone about setting up an assignment in your next course, reach out at contact@wikiedu.org.

by Samantha Weald at January 23, 2017 08:00 PM

Sam Wilson

Wikisource Hangout

I wonder how long it takes after someone first starts editing a Wikimedia project that they figure out that they can read lots of Wikimedia news on https://en.planet.wikimedia.org/ — and when, after that, they realise they can also post to the news there? (At which point they probably give up if they haven’t already got a blog.)

Anyway, I forgot that I can post news, but then I remembered. So:

There’s going to be a Wikisource meeting next weekend (28 January, on Google Hangouts), if you’re interested in joining:
https://meta.wikimedia.org/wiki/Wikisource_Community_User_Group/January_2017_Hangout

by Sam Wilson at January 23, 2017 11:43 AM

Gerard Meijssen

#Wikipedia - #Sources anyone?

Sources are important. They make it obvious what is correct and what is not. For content in Wikidata, Wikipedia is an important source of information. It aims to be neutral and there are loads of sources.

When you bring the information together in a tree like the one to the right, it follows that all the information has to agree with that interpretation. It all starts with "Duqaq Temür Yalığ" but he is called "Toqaq" in the article on Seljuq.

The article on the Seljuk Empire is quite wonderful because it includes the spouses of the Sultans and their lineage. Really relevant to understand the politics of the time.

I do include information where I can find and understand it. Quite often, information is problematic. Sometimes it is obviously wrong as in attributing a person to a modern country. As more data is entered, the information becomes more complicated and coherent. Errors become more glaringly obvious. It becomes more and more a matter of adding individual statements that are the difference and not so much long lists of data.

At some stage the puzzles will be left and sources will need to be sought to make the right statements, not the obvious statements.
Thanks,
       GerardM

by Gerard Meijssen (noreply@blogger.com) at January 23, 2017 10:34 AM

Andre Klapper

Wikimedia in Google Code-in 2016

(Google Code-in and the Google Code-in logo are trademarks of Google Inc.)

Google Code-in 2016 has come to an end. Wikimedia was one of the 17 organizations that took part, offering mentors and tasks to 14-17 year old students exploring free and open source software projects via small tasks.

Congratulations to our 192 students and 46 mentors for fixing 424 tasks together!

For an organization admin, deciding on the top five students at the end of the contest always takes time and discussion: many students have provided impressive work, and it hurts to have to put a great contributor in 6th or 7th place.
Google will announce the Grand Prize winners and finalists on January 30th.

Reading the students' final feedback always reassures us that all the effort mentors and organization admins put into GCI is worth it:

  • In 1.5 month, I learned more than in 1.5 year. — Filip
  • I know these things will be there forever and it’s a big thing for me to have my name on such a project as MediaWiki. — Victor
  • What makes kids like me continue a work is appreciation and what the community did is give them a lot. — Subin
  • I spent my best time of my life during the contest — David

Read blogposts by GCI students about their experience with Wikimedia.

To list some of the students’ achievements:

  • Many improvements to Pywikibot, Kiwix (for Wikipedia offline reading), Huggle, WikiEduDashboard, Wikidata, documentation, …
  • MediaWiki’s Newsletter extension received a huge amount of code changes
  • The Pageview API offers monthly request stats per article title
  • jQuery.suggestions offer reason suggestions to block, delete, protect forms
  • A {{PAGELANGUAGE}} magic word was added
  • Changes to number of observations in the Edit Quality Prediction model
  • A dozen MediaWiki extension pages received screenshots
  • Lots of removal of deprecated code in MediaWiki core and extensions
  • Long CREDIT showcase videos got split into ‘one video per topic’ videos on Wikimedia Commons
  • Proposals for a redesign of the Romanian Wikipedia’s main page
  • Performance improvements to the importDump.php maintenance script
  • Converted Special:RecentChanges to use the OOUI library
  • Allow users to apply change tags as they make logged actions using the MediaWiki web API
  • Added some hooks to Special:Unblock
  • Added a $wgHTTPImportTimeout setting for Special:Import
  • Added ability to configure the web service endpoint and added phpcs checks in MediaWiki’s extension for Ideographic Description Sequences
  • Glossary wiki pages follow the formatting guidelines
  • Research on team communication tools
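To make the Pageview API item above a little more concrete, here is a minimal Python sketch that only builds the request URL for monthly per-article statistics. The path layout follows the public Wikimedia REST API as I understand it (project, access, agent, article, granularity, start, end); the helper name `monthly_pageviews_url` and its defaults are hypothetical and chosen for illustration, and no request is actually sent.

```python
from urllib.parse import quote

# Base path of the per-article pageviews endpoint (assumed from the public
# Wikimedia REST API documentation; verify before relying on it).
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def monthly_pageviews_url(project, title, start, end,
                          access="all-access", agent="all-agents"):
    """Build the URL for monthly pageview stats of one article.

    `start` and `end` are timestamps such as "20161201" (YYYYMMDD).
    This helper is illustrative only; it does not perform the request.
    """
    # Article titles use underscores instead of spaces and must be URL-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{article}/monthly/{start}/{end}"


if __name__ == "__main__":
    print(monthly_pageviews_url("en.wikipedia", "Angel food cake",
                                "20161201", "20161231"))
```

The resulting URL can be fetched with any HTTP client; the response is JSON with one entry per month in the requested range.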

We also received valuable feedback from our mentors on what we can improve for the next round.

Thanks to everybody for your friendliness, patience, and help provided.
Thanks for your contributions to free software and free knowledge.
See you around on IRC, mailing lists, tasks, and patch comments!

by aklapper at January 23, 2017 04:47 AM

January 21, 2017

Andy Mabbett (pigsonthewing)

Four Stars of Open Standards

I’m writing this at UKGovCamp, a wonderful unconference. This post constitutes notes, which I will flesh out and polish later.

I’m in a session on open standards in government, convened by my good friend Terence Eden, who is the Open Standards Lead at Government Digital Service, part of the United Kingdom government’s Cabinet Office.

Inspired by Tim Berners-Lee’s “Five Stars of Open Data”, I’ve drafted “Four Stars of Open Standards”.

These are:

  1. Publish your content consistently
  2. Publish your content using a shared standard
  3. Publish your content using an open standard
  4. Publish your content using the best open standard

Bonus points for:

  • making clear which standard you use
  • publishing your content under an open licence
  • contributing your experience to the development of the standard.

Point one, if you like, is about having your own local standard: if you publish three related data sets, for instance, be consistent between them.

Point two could simply mean agreeing a common standard with other teams in your organisation, neighbouring local authorities, or suchlike.

In points three and four, I’ve taken “open” to be the term used in the “Open Definition”:

Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).

by Andy Mabbett at January 21, 2017 03:13 PM

Gerard Meijssen

#Wikipedia - Support understanding the #gender gap

#Wikidata needs to mature. #Wikipedia needs to mature. They both have wishes they aim to fulfil that escape them. The gender gap is such an issue and it can be used to illustrate how both will mature when they cooperate.

When you want to know how many articles are expected to be written at a given point you need to analyse the red links. They indicate articles that are likely notable and indicate a structural need in Wikipedia. To do that you need data and you need a tool.

When links exist for every red link to an item in Wikidata, you have both the data and a tool. This will help Wikipedia with its disambiguation, and it will show up what a Wikipedia is missing. It is a tool that may drive people to write articles about the missing links.

All the red links will now link to Wikidata and articles in other Wikipedias. It also allows for people to add statements to Wikidata so that facts about those items are known. For instance that it is about a woman. When statements to awards, professions and events are known, there is added weight to write an article.

In this way two purposes are served: researchers have better tools that help them understand the gender gap, and it will help people who care about the gender gap work on reducing that gap.

Technically it is not that complicated to achieve. If there is a problem with this proposal it may be that Wikipedians need to understand that this is not a power grab but a way to improve quality and efficiency of their project.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 21, 2017 08:51 AM

January 20, 2017

Wiki Education Foundation

Monthly Report for December 2016

Highlights

  • At the end of the Wikipedia Year of Science, we tallied the contributions our 287 science courses made: 4.93 million words added to 5,640 articles, including 622 new entries, viewed 270 million times just during their respective terms. That means we added the equivalent of 11% of the last print edition of Encyclopedia Britannica to science content on Wikipedia during the Year of Science.
  • Our fall term wrapped up in December, with us supporting more than 6,300 students in 276 courses. In the fall term, student editors added 4.2 million words of content across all disciplines, providing better content for 253 million readers.
  • We announced new Wikipedia Visiting Scholars that will be working with the University of San Francisco’s Department of Rhetoric and Language and San Francisco State University’s Paul K. Longmore Institute on Disability.
  • We released an addition to our series of subject-specific editing brochures: Editing Wikipedia articles on Political Science.

Programs

Educational Partnerships

Samantha Weald attends the American Geophysical Union conference in San Francisco

In December, Outreach Manager Samantha Weald, Classroom Program Manager Helaine Blumenthal, Director of Programs LiAnna Davis, and Educational Partnerships Manager Jami Mathewson attended the final academic conference during the Wikipedia Year of Science. At the American Geophysical Union’s annual meeting in San Francisco, staff members met earth scientists eager to improve Wikipedia’s content. At the conference, we spoke to dozens of scientists who believe Wikipedia is a valuable website for them, their students, and the world. We’re excited to bring more geophysics, geology, and earth science students to Wikipedia in the coming years, helping us amplify the impact of this year’s Wikipedia Year of Science.

As we wrapped up another year of recruitment, we reflected on our aim to increase the Wiki Education Foundation’s visibility to university and college instructors in the United States and Canada. Over the course of the year, we attended 23 conferences to share Wiki Ed’s mission with university instructors. We also made 12 campus visits, where Wiki Ed’s program participants hosted us to encourage their colleagues to join our efforts. Additionally, we hosted four outreach webinars. Through these outreach initiatives, we brought more instructors than ever into the Classroom Program, supporting a record 515 courses and nearly 11,000 students in 2016.

Classroom Program

Status of the Classroom Program for Fall 2016 in numbers, as of December 31:

  • 276 Wiki Ed-supported courses were in progress (130 or 47%, were led by returning instructors).
  • 6,307 student editors were enrolled.
  • 60% of students were up-to-date with the student training.
  • Students edited 5,700 articles, created 722 new entries, and contributed 4.18 million words.

The Fall 2016 term has come to a close, and we’re busily preparing for Spring 2017. Our most successful term to date was defined by growth, productivity, and experimentation. With 276 courses doing Wikipedia assignments, the Classroom Program has grown to nearly triple the size it was in Fall 2014. And of course with this rapid growth, our students are having an even greater impact on Wikipedia. To ensure that all of our instructors and students get the support they need, we implemented several new programs during the Fall 2016 term, including a series of interactive webinars and a more robust help section built into the Dashboard.

While we’re proud of the above numbers, the true success of the Classroom Program is, in some ways, immeasurable. As recent events have demonstrated, fake news poses a serious threat to an informed citizenry. Students who learn how to contribute to Wikipedia are not only making reliable information available to the public at large, they are also developing critical media literacy skills that enable them to discern real from fake sources of information. In learning Wikipedia’s strict policies around sourcing, our students know to question headlines and to dig deeper. These are lifelong skills that not only serve our students, but society more generally.

The close of 2016 also marked the end of the Wikipedia Year of Science. During this year-long initiative, we strove to improve Wikipedia content in STEM and social science fields, while developing critical science communication skills among our students. Our Year of Science campaign consisted of 287 courses and 6,270 students. Together, they contributed 4.93 million words to 5,640 Wikipedia articles, including 622 new entries, and their work was viewed 270 million times in the spring and fall terms alone. A specific goal of the Year of Science was to improve Wikipedia’s coverage of women scientists, and our students either expanded or created well over 100 articles on important but overlooked women in the sciences. While the Year of Science has come to an end, we recognize that our work in this area has, in many ways, only just begun. Science literacy and media literacy are key components of an accurately informed society, and we will continue to prioritize both going forward.

The Wikipedia article on angel food cake was among those improved in Richard Ludescher’s Food Physical Systems class at Rutgers University.
Image: Angel food cake with strawberries by F_A, CC BY 2.0, via Wikimedia Commons.

We saw some great work from several courses:

When we think about food, we think about taste, preparation, and whether it’s healthy or not. We rarely think about things like hydrogen bonding, electrostatic interactions, or Van der Waals interactions. But these are important aspects, and thanks to students in Richard Ludescher’s Food Physical Systems class at Rutgers University, information of this sort is now available through a number of Wikipedia articles. A student in the class expanded the angel food cake article by adding sections about the manufacturing process, the ingredients used in commercial production, and the physical and biochemical roles played by these ingredients in the final product. Another student expanded the croissant article by adding information about their manufacturing and the changes in the physical and chemical properties of ingredients during manufacturing, baking, and storage. Other students added information on the physical and chemical properties of a number of other foods including marshmallows, mayonnaise and chewing gum. The expansion of the meat analogue article added information about the composition, processing, and physical structure of the product that are required to mimic the texture and taste of meat.

Students in Glenn Dolphin’s Introductory Geology class continued their work expanding biographies of women geologists. Maria Crawford contributed to lunar petrology, continental collisions (on earth) and the geology of the Pennsylvania Piedmont, but at the beginning of the term her Wikipedia biography was only four sentences long and said nothing of her contributions. A student turned that stub into a substantial article which documented her achievements from a career that spanned four decades. Another student created an article on Virginia Harriett Kline, a stratigrapher who earned a Ph.D. in geology in 1935 and made important contributions to petroleum geology. Other students in the class continued to expand the articles they had worked on earlier in the term.

Early studies of child psychology often focused on conflict and aggression. Lois Barclay Murphy chose instead to focus on normal childhood development; she played an important role in the development of that field. Marvin Zuckerman played an important role in the development of the field of sensation seeking. Mary K. Rothbart is an expert on infant temperament development. Rena R. Wing is an expert on the behavioral treatment of obesity. None of these psychologists had biographies on Wikipedia. Similarly, the field of geriatric psychology had been omitted. These were among the articles created by students in James Council’s History and Systems of Psychology class at North Dakota State University. Other students worked to expand existing articles, like the profile of mood states, which was expanded from a short stub into a substantial article.

One of Wiki Ed’s great successes has been recruiting professors in archaeology and anthropology to expand and improve articles on archaeological sites, artifacts and methods. There are thousands of sites which are notable but not covered in Wikipedia — students in courses like Rice University’s African Prehistory have added to articles like Manyikeni in Mozambique and KM2 and KM3 in Tanzania. They’ve also updated the article on South Africa’s Border Cave, an already substantial article which now covers more modern work on the site and artifacts found within it.

Critical theory is hard work. Explaining it to laypeople is even harder. Doing so on Wikipedia, harder still. The language of critical theory (in nearly any discipline: law, economics, feminism) is often disjoint from or at odds with the main voices in the discipline; otherwise it’s hard to say it’s critical! Students in John Willinsky’s Critical Theory and Pedagogies outdid themselves in adding to Wikipedia’s coverage of critical mathematics pedagogy and critical pedagogy (a hard phrase to hear for the policy debate veterans in our audience), and in expanding coverage of books like Learning to Labour, a critical educational ethnography. Work on narrow, difficult topics like critical pedagogy of place requires research and preparation, and the students’ work speaks to the hard work they’ve done.

Finally, interim Content Expert Rob Fernandez, who graciously agreed to join our staff temporarily to help out with the rush at the end of the fall term, wrapped up his contract with Wiki Ed in December. Rob’s help to ensure our student editors and instructors got top-notch support was invaluable. Thank you for your contributions, Rob, and best of luck on your new job!

Community Engagement

Barbara Page’s article about thrombosis prevention explains treatments to prevent the formation of blood clots inside a blood vessel.
Image: Blausen 0088 BloodClot.png by Blausen.com staff, CC BY 3.0, via Wikimedia Commons.

Community Engagement Manager Ryan McGrady announced two new Visiting Scholars positions at the beginning of this month, which got their start at the very end of last month and are already using institutional resources to improve Wikipedia. User:Lingzhi partnered with the University of San Francisco to improve rhetoric and language topics, and Jackie Koerner is developing articles on disability, such as disability in the United States, with the Paul K. Longmore Institute on Disability at San Francisco State University.

Existing Scholars continued to produce great work. George Mason University’s Gary Greenbaum had another article achieve the impressive Featured Article designation, Alabama Centennial half dollar. The University of Pittsburgh’s Barbara Page built up her portfolio of impressive medical editing with substantial improvements to Wikipedia’s entry for thrombosis prevention.

The community is getting ready to start the 11th annual WikiCup competition, in which experienced editors are awarded points for producing high-quality content. For the 2016 event that just ended, Wiki Ed sponsored a side competition with prizes for the two users with the most Good Articles and Featured Articles on scientific topics. In first place was the overall winner of the competition, User:Casliber, who developed articles like the violet webcap mushroom and the Lynx constellation. In second place, despite sitting out the final round of the competition, was User:Cwmhiraeth, who improved some very big topics like millipede and habitat.

Program Support

Our newest subject-specific editing brochure will help students working on political science articles.

Communications

LiAnna Davis has been working with San Francisco-based media firm PR & Company to pitch stories to national press about the impact Wiki Ed’s programs are having, and especially the impact of the Year of Science.

We announced the newest in our series of subject-specific editing brochures in December: Editing Wikipedia articles on Political Science. Thanks to the Wikipedia editors and partners at the MPSA who provided review and/or feedback for this.

Digital Infrastructure

Continuing with the main technical focuses from last month, Product Manager Sage Ross spent December focused on collaboration and mentorship, as well as working on bug fixes and feature development.

Sejal Khatri started her Outreachy internship this month to improve the Dashboard’s user profile pages. She’s already made considerable progress toward our plans for these profile pages, and she’s made several improvements and bug fixes for course pages as well. Check out Sejal’s latest post on her internship blog to see what she’s been up to. December was also a busy month for the high school students participating in Google Code-In. Sage has been mentoring them on Dashboard tasks, including some performance and accessibility improvements, documentation and testing, bug fixes, and new features to help Wiki Ed staff handle new courses more efficiently. This month saw contributions to Wiki Ed’s codebase from nine developers outside of our staff and contractors — a new record.

Sage developed the initial version of an Article Viewer tool that lets you see a full Wikipedia article — as it looks on Wikipedia — without leaving the Dashboard. The Article Viewer is currently available alongside the Diff Viewer when you zoom in on a particular edited article in a course’s Articles tab.

In anticipation of increased Dashboard usage in 2017, in late December — just before Christmas weekend — we migrated the software to a more powerful server. The ensuing 20 minutes were the only downtime for the Dashboard during the entire 2016 term (although a handful of network disruptions and problems with Wikimedia servers did affect Dashboard users at earlier points in the term).

Research and Academic Engagement

In December, Research Fellow Zach McDowell completed the focus group portion of the research program, which comprised 13 focus groups. A total of 475 minutes of focus group recordings were sent away for transcription, resulting in more than 250 pages of text for analysis.

Survey research participation continued to grow, with more than 1,200 responses in the pre-assessment as well as more than 850 responses for the post-assessment. Surveys close on January 17, 2017.

Zach spent the remainder of his time beginning a preliminary assessment of the data and the cleanup plan. Additionally, Zach has been seeking a graduate student to engage as a data science intern to expedite analysis of the data.

Finance & Administration / Fundraising

Wiki Education Foundation’s San Francisco team holiday party.

Finance & Administration

To celebrate the holiday season, San Francisco-based staff gathered at LiAnna’s house for a holiday party. Executive Director Frank Schulenburg led the group in the creation and consumption of a Feuerzangenbowle, and we enjoyed dinner and games.

For the month of December, expenses were $157,772 versus the approved budget of $206,733. Most of the $49k variance continues to be due to staffing vacancies ($13k), the timing of outside professional services ($22k), and printing expenses ($11k).

Expenses December 2016 Actual vs. Plan

Our year-to-date expenses of $900,208 were also less than our budgeted expenditures of $1,196,085, by $296k. Like the monthly variance, the year-to-date variance was largely driven by staffing vacancies ($100k). In addition, the timing and deferral of professional services ($69k), marketing and cultivation ($18k), volunteer workshops ($13k), and printing ($18k), as well as savings in staffing-related expenses ($16k) and in travel ($61k), contributed to the variance.

Expenses Year to Date December 2016 Actual vs. Plan

Fundraising

  • Wiki Ed conducted its first-ever individual donor acquisition mailing, which reached more than 11,000 individuals. Appeals were sent via U.S. Mail in late December.

  • Google renewed their support with a $20,000 gift to Wiki Ed.

Office of the ED

Current priorities:

  • Securing funding
  • Developing a plan for next fiscal year
  • Working with the board on additional funding options

In order to be able to share an early outline of our future programmatic work with the board, Frank traditionally starts brainstorming ideas with senior staff in December. That’s why this month we embarked on thinking about the general direction for the upcoming fiscal year 2017–18 and developed a vision for the time ahead to be shared with the board on January 28–29.

Frank also started conversations with existing and prospective funders on some initiatives that are in our project pipeline for 2017. These conversations – as well as our projections of the expected impact – will inform our roadmap for the upcoming year and beyond.

Also in December, Frank prepared a series of documents for an ad hoc board taskforce that will look into additional funding streams prior to the in-person board meeting at the end of January. The board taskforce meetings will start next month via video conference with the goal of coming up with a recommendation to the board as a whole.

 

Visitors and guests

  • Merrilee Proffitt, OCLC
  • Steve Kaplan, Message LA

by Ryan McGrady at January 20, 2017 11:56 PM

Content Translation Update

January 20 CX Update: More fixes for page loading and template editor

Hello, and welcome to another CX update post, in which I am happy to report about several significant bug fixes.

  • Pages that had full stops (⟨.⟩) in headings couldn’t be loaded after auto-saving and closing the browser tab. This is now fixed. It is a follow-up to a similar bug whose fix was reported last week. If you still have issues with loading saved pages, please report them. (bug report)
  • Adapted infoboxes would often say “Main Page” at the top, no matter which page was being translated, or into which language. This could also happen with other kinds of templates. It affected pages with templates that used the {{PAGENAME}} magic word. This is now fixed, and the auto-adapted template now shows the relevant page name. (bug report)
  • An unnecessary horizontal scrollbar was shown on some pages that had wide tables. It was removed. (code change)

 


by aharoni at January 20, 2017 07:50 PM

Wiki Education Foundation

Rediscovering the “higher” in higher education with a Wikipedia writing assignment

Dr. Joel Parker is Associate Professor of Biological Sciences at SUNY Plattsburgh, where he has incorporated Wikipedia into his Cell Biology courses. In November we featured some of the great work his students did in our Cell Service roundup. In this post, he explains how assigning students to contribute to Wikipedia brings them through the process of discovery.

Joel Parker

Assigning students to write for Wikipedia achieves the highest outcome of higher education by teaching your students the full process of discovery. This lesson is especially important today as higher education is being debased with lower learning outcomes that overemphasize the practical training of our students for the workplace. What makes higher education “higher” is the opportunity for students to work with scholars to learn how to advance both our own knowledge and knowledge within our scholarly disciplines. Writing for Wikipedia can facilitate the transition from passive learner to active discoverer for your senior students. I make this happen for my senior level Cell Biology class with a writing for Wikipedia assignment that requires my students to go through all of the stages of the academic discovery process.

Discovery is the general common objective that defines higher education. This discovery process happens at three levels at universities. The first level of discovery is students discovering for themselves previously learned knowledge about the world. This is mastering the material and background knowledge that one expects of a degree holder. The next level is students and academics working together to discover truths about how the universe works. It involves noticing a gap or flaw in our current knowledge, then imagining and proving a solution. Finally, and no less important, the third level is a personal version of the second: effectively contributing the answer to the discipline and the world by communicating the discovery. This personal discovery is the transformation required by our students to gain the confidence to transition from just being beneficiaries of knowledge, to becoming propagators and contributors of new knowledge. This third level is especially important to higher education as the teens and early twenties are perhaps the most formative years when our adult personalities and sense of selves are formed.

In my senior level cell biology class I have my students do each step of the discovery process in a Wikipedia writing assignment. The first step begins when I assign my students to search for and critique an existing cell biology Wikipedia article. This means finding mistakes, missing sections, and places where they can improve the article. The next step is actually doing the fixes and creating new content to fill the voids. This technical side includes writing in the encyclopedic style, communicating science at the correct level, and can even include graphic design when figures are called for. The final step is publicly publishing the article in the correct format and style, then dealing with the judgments, suggestions and edits from the rest of the community. I constantly remind my students throughout the assignment that the overriding goal and assessment criterion is their contribution through improving the articles. It is not sufficient to just write the minimum number of sentences and put in some number of new citations. The changes must improve the article and the citations have to be citations that others will genuinely find helpful or else they do not count. All academics will instantly recognize this outlined process as exactly what we do in our own intellectual work: identifying a question, answering the question, and publishing the solution with peer review. The goal, and the measurable outcome for the students that put in the effort, are articles significantly improved in some way.

With writing for Wikipedia, my students have the opportunity to experience the personal transition from being beneficiaries of their most used and appreciated reference source, to becoming contributors to that source. The objective is for them to become confident enough to see themselves as experts with the ability to contribute and improve the world with what they have learned from their university education. They experience this directly because their work is not just about their grade, but also clearly beneficial to future students like themselves who will be using the edited pages. Thus the assignment forces a maturing change in perspective. Even the most incremental of improvements means the world is different and better thanks to the application of their education to the world’s largest and most used encyclopedia.

Facilitating and advancing discovery is what defines higher education. Wikipedia writing assignments are one of the best ways to teach, and to remind ourselves of, that primary learning outcome.

If you’d like to learn more about how to incorporate Wikipedia into your course, visit teach.wikiedu.org or send us an email at contact@wikiedu.org.

Photo: Dr. Joel Parker.jpg, by Joel Parker, CC BY-SA 4.0, via Wikimedia Commons.

by Guest Contributor at January 20, 2017 06:02 PM

Wikimedia UK

Wikidata: the new hub for cultural heritage

This article is by: Dr Martin Poulter, Wikimedian In Residence at the University of Oxford – This post was originally published on the Oxford University Museums blog.

There is a site that lets users create customised and unusual lists of art works: works of art whose title is an alliteration, self-portraits by female artists, watercolour paintings wider than they are tall, and so on. These queries do not use any gallery or museum’s web site or search interface but draw from many collections around the world. The art works can be presented in various ways, perhaps on a map of locations they depict, or in a timeline of their creation, colour-coded by the collection where they are held. The data are incomplete, but these are the early days of an ongoing and ambitious project to share data about cultural heritage—all of it.

Judith with the head of Holofernes, Self Portrait (1610s) Fede Galizia, John and Mable Ringling Museum of Art

Wikimedia is a family of charitable projects that are together building an archive of human knowledge and culture, freely shareable and reusable by anyone for any purpose. Wikipedia, the free encyclopedia, is only the best-known part of this effort. Wikidata is a free knowledge base, with facts and figures about tens of millions of items. These data are offered as freely as possible, with no restriction at all on their copying and reuse.

Already, large amounts of data about artworks are being shared by formal partnerships. The University of Barcelona have worked with Wikimedians to share data about Art Nouveau works, recognising that it is far better to have all these data in one place than scattered across various online and offline sources. The National Library of Wales has employed a Wikidata Visiting Scholar to share data about its artworks, including the people and places they depict. The Finnish National Gallery, the Rijksmuseum in Amsterdam and the National Galleries of Scotland are among the institutions who have either formally uploaded catalogue data to Wikidata, or made data freely available for import. To see the sizes of these shared catalogues, one just has to ask Wikidata.

Wikidata logo – Image CC BY-SA 3.0

Wikidata queries can be built using SPARQL, a database query language not for the faint-of-geek. However, there is an open community of users sharing and improving queries. The visualisations they create can be shared online or embedded inside other sites or apps. Developers can build applications for the public; easy to use, but offering a distinctive view of Wikidata’s web of knowledge.
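To give a flavour of what such a query looks like, here is a minimal sketch in Python that holds a SPARQL query for one of the lists mentioned earlier (self-portraits by female artists) and turns it into a request URL for the public Wikidata Query Service. The property and item identifiers are recalled from wikidata.org and should be treated as assumptions to verify there; the names `SELF_PORTRAITS_BY_WOMEN` and `query_url` are made up for this illustration.

```python
from urllib.parse import urlencode

# Public endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Identifiers below are assumptions to double-check on wikidata.org:
#   P136 = genre, P170 = creator, P21 = sex or gender,
#   Q192110 = self-portrait, Q6581072 = female.
SELF_PORTRAITS_BY_WOMEN = """
SELECT ?work ?workLabel ?creatorLabel WHERE {
  ?work wdt:P136 wd:Q192110 ;
        wdt:P170 ?creator .
  ?creator wdt:P21 wd:Q6581072 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50
""".strip()


def query_url(sparql: str) -> str:
    """Return a GET URL asking the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    # Print the query and the URL; nothing is sent over the network here.
    print(SELF_PORTRAITS_BY_WOMEN)
    print(query_url(SELF_PORTRAITS_BY_WOMEN))
```

The sketch only builds the URL and leaves fetching to whatever HTTP client is at hand. The query text can also be pasted directly into the query service's web editor.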

One such application is Crotos, a family of tools generating image galleries and maps of art, filtered by format, artist, place depicted and other attributes. Crotos shows images of the art, so it only includes works with a digital image available in Wikimedia Commons. Wikidata itself has no such restriction: it describes art whether or not a freely-shareable scan is available.

So while the Wikidata site itself might not have mass appeal, the service it provides is gradually transforming the online world, providing a single source of data for some of the most popular web sites and apps. Those “infoboxes” summarising key facts and figures at the top of Wikipedia articles are increasingly being driven from Wikidata, so dates, locations and other facts can be entered in one place but appear on hundreds of sites.

The really exciting prospect is that of building visualisations and other interactive educational objects, integrating information from many collections and other data sources. Wikidata would be interesting enough as an art database, but it also shares bibliographic, genealogical, scientific, and other kinds of data, covering modern as well as historical topics. This allows combined queries, such as art by people born in a particular region and time period, or works depicting people described in a particular book.

Wikidata is massively multilingual, using language-independent identifiers and connecting these to names in hundreds of languages as well as to formal identifiers. In a way it is the ultimate authority file; a modern Rosetta Stone connecting identifiers from institutions’ authority files, scholarly databases and other catalogues (Hinojo (2015)).

There are thousands of properties that a Wikidata item can have. Just considering a small selection that are relevant to art and culture, it is clear that the number of possible queries is astronomical.

  • Many features of an art work can be described:
    • instance of: in other words, the type. Wikidata has many types to choose from, from oil sketch and drawing, via architectural sculpture and stained glass, to aquatint and linocut
    • collection
    • material used
    • height, width
    • genre, movement
    • co-ordinates of the point of view
  • People and places can be connected to an artwork: depicts, creator, attributed to, owned by, after a work by, commissioned by.
  • There are relations between people: parent, sibling, influenced by, school of, author and addressee of a letter.
  • People can also be connected to groups or organisations: member of, founder, employer, educated at.
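As a rough illustration of how these properties combine, the following Python sketch maps a few of the labels above to Wikidata property IDs and assembles a query from whichever constraints are supplied. The IDs (including the example items for "painting" and for Rembrandt) are recalled from wikidata.org and are assumptions to check there; `ART_PROPERTIES` and `build_query` are hypothetical names for this example.

```python
# A few labels from the list above mapped to Wikidata property IDs.
# These IDs are assumptions; verify them on wikidata.org before use.
ART_PROPERTIES = {
    "instance of": "P31",
    "collection": "P195",
    "height": "P2048",
    "width": "P2049",
    "depicts": "P180",
    "creator": "P170",
}


def build_query(constraints: dict[str, str], limit: int = 100) -> str:
    """Build a SELECT query for works matching every label -> item constraint."""
    lines = []
    for label, item in constraints.items():
        # An unknown label raises KeyError rather than producing a bad query.
        lines.append(f"  ?work wdt:{ART_PROPERTIES[label]} wd:{item} .")
    body = "\n".join(lines)
    return f"SELECT ?work WHERE {{\n{body}\n}}\nLIMIT {limit}"


if __name__ == "__main__":
    # Assumed items: Q3305213 = painting, Q5598 = Rembrandt.
    print(build_query({"instance of": "Q3305213", "creator": "Q5598"}, limit=10))
```

Each extra key in the dictionary adds one more triple pattern, so the number of distinct queries grows multiplicatively with the properties and values available.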

With so many kinds of data, Wikidata draws in volunteer contributors with varying interests. Just as there are people who will sit down for an evening to improve a Wikipedia article or to categorise images on Wikimedia Commons, there are people fixing and improving Wikidata’s entries and queries. As with Wikipedia, Wikidata benefits from the intersection of different interests. Contributors speak different languages and have different background knowledge. Some are interested in a particular institution’s collection, while others are interested in a particular style of art, others in a given location or historic individual. Hence one entry can attract multiple contributors, each motivated by a different interest.

Over time, Wikidata’s role in Wikipedia will expand. Explore English Wikipedia and you find many list articles, such as List of works by Salvador Dalí or List of Hiberno-Saxon illuminated manuscripts. At the moment, these are all manually maintained, but a program—the ListeriaBot—has been created to turn Wikidata queries into lists suitable for Wikipedia: see for example this (draft) list of paintings of art galleries. Catalan Wikipedia, with a much smaller contributor base than the English language version, is already using the bot to write list articles such as Works of Jacob van Ruisdael, saving many hours of human effort. As automated creation of list articles becomes more widespread, cultural institutions that share catalogue data will help ensure the correctness and completeness of these articles.

Un paisatge del riu amb figures, by Jacob Van Ruysdael (1628/1629–1682), Museu de Belles Arts Puixkin

Like Wikipedia, Wikidata depends on Verifiability: any statement of fact is expected to cite or link a credible published source. Hence it has active links to catalogues and other formally vetted sites, which usually supply more scholarly detail and primary research than Wikidata itself. So Wikidata is not a replacement for cultural institutions’ catalogues. The hub metaphor is apt: it is a central point, linking together disparate resources and giving them a useful shape. Its credibility will always depend on the formally vetted sources that it cites, and there will always be users who want to check what they read by following up the citations. In practice, this means that sharing ten thousand records with Wikidata is a way to get ten thousand incoming links to the institution’s own catalogue. What’s more, the free reuse of Wikidata means that other sites will use those links.

Wikidata and its partners have a huge task ahead of them, but the potential reward is vast. We could have data on all artworks, browsable in endless and genuinely new ways, with connections to their official catalogues, their physical locations, and scholarly literature. The sooner the cultural sector as a whole gets involved, the sooner we can bring this about.

References

Note

I am grateful to Wikidata users Jane Darnell (User:Jane023), Magnus Manske (User:Magnus Manske – creator of User:ListeriaBot) and Andy Mabbett (User:Pigsonthewing) for many of the useful links in this article.

 

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International Licence.

by Martin Poulter at January 20, 2017 12:49 PM

User:Geni

The Canon EF 11-24mm f/4 for Wikipedians

It’s a £2700 lens. At that price I suspect anyone buying it can come to their own conclusions. Still, on a full-frame camera it is an extremely useful lens. The width makes it great for urban architecture, larger items in museums, and interiors in general. The short minimum focus distance makes it great for objects in cases and the lens’s sharpness makes it viable to crop the resulting images.

Obviously if you want to shoot longer than 24mm then you need another lens but for wide angle work the lens is excellent.

Downsides. It’s a £2700 lens. You could buy quite a lot of other gear for that. The Sigma 12-24mm f/4 is about £1000 cheaper and nearly as sharp at the wide end. If you are shooting on a crop sensor then the 10-18mm is under £300, so unless you really, really need the sharpness for some reason I wouldn’t go near this lens for a crop system. On top of that it’s big and it’s heavy. Not something I have an issue with, but for anyone more weight conscious (but then why shoot full frame?) it may present a problem. The f/4 speed may be less than ideal for indoor work but that’s becoming less and less of a problem as camera low light abilities improve.

Overall a very useful bit of kit but also really rather on the expensive side.


by geniice at January 20, 2017 11:48 AM

Shyamal

The many shades of citizen science

Everyone is a citizen but not all have the same kind of grounding in the methods of science. Someone with a training in science should find it especially easy to separate pomp from substance. The phrase "citizen science" is a fairly recent one which has been pompously marketed without enough clarity.

In India, the label of a "scientist" is a status symbol, indeed many actually proceed on paths just to earn status. In many of the key professions (example: medicine, law) authority is gained mainly by guarded membership, initiation rituals, symbolism and hierarchies. At its roots, science differs in being egalitarian but the profession is at odds and its institutions are replete with tribal ritual and power hierarchies.

Long before the creation of the profession of science, "Victorian scientists" (who of course never called themselves that) pursued the quest for knowledge (i.e. science) and were for the most part quite good as citizens. In the field of taxonomy, specimens came to be the reliable carriers of information and they became a key aspect of most of zoology and botany. After all, what could you write about or talk about if you did not have a name for the subject under study? Specimens became currency. Victorian scientists collaborated in various ways that involved sharing information, sharing/exchanging specimens, debating ideas, and tapping a network of friends and relatives for gathering more "facts". Learned societies and their journals helped the participants meet and share knowledge across time and geographic boundaries. Specimens, the key carriers of unquestionable information, were acquired for a price and there was a niche economy created with wealthy collectors, not-so-wealthy field collectors and various agencies bridging them. That economy also included the publishers of monographs, field guides and catalogues who grew in power along with organizations such as museums and later universities. Along with political changes, there was also a move of power from private wealthy citizens to state-supported organizations. Power brings disparity and the Victorian brand of science had its share of issues, but has there been progress in the way of doing science?

Looking at the natural world can be completely absorbing. The kinds of sights, sounds, textures, smells and maybe tastes can keep one completely occupied. The need to communicate our observations and reactions almost immediately makes one need to look for existing structure and framework and that is where organized knowledge a.k.a. science comes in. While the pursuit of science might be seen by individuals as being value neutral and objective, the settings of organized and professional science are decidedly not. There are political and social aspects to science and at least in India the tendency is to view them as undesirable and not to be talked about so as to appear "professional".

Being silent so as to appear diplomatic probably adds to the problem. Not engaging in conversation or debate with "outsiders" (a.k.a. mere citizens) probably fuels the growing label of "arrogance" applied to scientists. Once the egalitarian ideal of science is tossed out of the window, you can be sure that "citizen science" moves from useful and harmless territory to a region of conflict and potential danger. Many years ago I saw a bit of this tone in a publication boasting the virtues of Cornell's ebird and commented on it. Ebird was not particularly novel to me (especially as it was not the first either by idea or implementation; lots of us would have tinkered with such ideas, and I did too with BirdSpot, which aimed to be federated and peer-to-peer, ideally something like torrent) but Cornell obviously is well-funded. I commented in 2007 that the wording used sounded like "scientists using citizens rather than looking upon citizens as scientists", the latter being in my view the nobler aim to achieve. Over time ebird has gained global coverage, but has remained "closed", not opening its code or its discussions on software construction and not engaging with its stakeholders. It has on the other hand upheld traditional political hierarchies and processes that ensure low quality in parts of the world where political and cultural systems are particularly based on hierarchies of users. As someone who has watched and appreciated the growth of systems like Wikipedia it is hard not to see the philosophical differences, almost as stark as right-wing versus left-wing politics.

Do projects like ebird see the politics in "citizen-science"? Arnstein's ladder is a nice guide to judge the philosophy behind a project.

I write this while noting that criticisms of ebird as it currently works are slowly beginning to come out (despite glowing accounts in the past). There are comments on how it is reviewed by self-appointed police (the problem seems to be not just in the appointment; indeed, why could the software designers not have allowed anyone to question any record and put in methods to suggest alternative identifications, gathering measures of confidence based on community queries and opinions?), there is supposedly a class of users who manage something called "filters" (the problem here is not just with the idea of creating user classes but also with the idea of using manually defined "filters"; to an outsider like me who has some insight into software engineering, poor software construction is symptomatic of poor vision, a poor guiding philosophy and probably issues in project governance), there are issues with taxonomic changes (I heard someone complain about a user being asked to verify an identification because of a taxonomic split, and that too a split that allows one to unambiguously relabel older records based on geography; these could have been automatically resolved, but developers tend to avoid fixing problems and obviously prefer to get users to manage them by changing their way of using the system; trust me, I have seen how professional software development works), and there are now dangers to birds themselves. There are also issues and conflicts associated with licensing, intellectual property and so on. Now it is easy to fix all these problems piecemeal but that does not make the system better; fixing the underlying processes and philosophies is the big thing to aim for. So how do you go from a system designed for gathering data to one where you want the stakeholders to be enlightened? Well, a start could be made by first discussing in the open.

I guess many of us who have seen and discussed ebird privately could have just said I told you so, but it is not just a few nor is it new. Many of the problems were and are easily foreseeable. One merely needs to read the history of ornithology to see how conflicts worked out between the center and the periphery (conflicts between museum workers and collectors); the troubles of peer-review and open-ness; the conflicts between the rich and the poor (not just measured by wealth); or perhaps the haves and the have-nots. And then of course there are scientific issues - the conflicts between species concepts, not to mention conservation issues - local versus global thinking. Conflicting aims may not be entirely solved, but you cannot have an isolated software development team, a bunch of "scientists", and citizens at large expected merely to key in data and be gone. There is perhaps a lot to learn from other open-source projects, and I think the lessons in the culture and politics of Wikipedia are especially interesting for citizen science projects like ebird. I am yet to hear of an organization where the head is forced to resign by the long tail that has traditionally been powerless in decision making, and allowing for that is where a brighter future lies. Even better would be where the head and tail cannot be told apart.

Postscript: 

There is an interesting study of fieldguides and their users in Nature - which essentially shows that everyone is quite equal in making misidentifications - just another reason why ebird developers ought to just remove this whole system creating an uber class involved in rating observations/observers.

23 December 2016 - For a refreshingly honest and deep reflection on analyzing a citizen science project see -  Caroline Gottschalk Druschke & Carrie E. Seltzer (2012) Failures of Engagement: Lessons Learned from a Citizen Science Pilot Study, Applied Environmental Education & Communication, 11:178-188.
20 January 2017 - An excellent and very balanced review (unlike my opinions) can be found here -  Kimura, Aya H.; Abby Kinchy (2016) Citizen Science: Probing the Virtues and Contexts of Participatory Research Engaging Science, Technology, and Society 2:331-361.

by Shyamal L. (noreply@blogger.com) at January 20, 2017 04:49 AM

Wikimedia Foundation

Community digest: Wiki Loves Women, bridging two gaps at a time; news in brief

Photo by Teemages, CC BY-SA 4.0.

Malouma is a Mauritanian singer and songwriter who had to put her career on hold after being forced into marriage. Many of her songs advocate for women’s rights, so much so that she was censored for part of the 1990s in Mauritania.

Hannah Kudjoe was a Ghanaian dressmaker who later became a political activist. She became one of the major figures calling for the independence of her country in the 1940s.

Qut el Kouloub was a writer from Egypt who contributed generously to French literature in the first half of the twentieth century. Many critics were confused as to whether her works are fictional or nonfictional historical biographies, and like Malouma, Kouloub used her novels to advocate for women’s rights in Egypt.

The work of Malouma, Kudjoe, and Kouloub deserves a place in history, but women of their background and experience are not well-represented on the internet. Two of the three women had no article on the English Wikipedia until Wiki Loves Women participants created them; the participants also helped develop the third one. In addition, over 1,300 other pages have been created or developed as part of the project.

Wiki Loves Women is a project that addresses two content gaps on Wikipedia at the same time. Its aim is to encourage both gender and geographical diversity on Wikipedia by adding content about African women. The project is now active in Côte d’Ivoire, Cameroon, Nigeria and Ghana.

“I … realised recently that many articles on Wikipedia are not being read [often],” says Olaniyan Olushola, the project manager of Wiki Loves Women in Nigeria. Olushola used the Wikimedia user group Nigeria Facebook page to promote the content created as part of the Wiki Loves Women events that he leads.

Olushola is trying to “find a way to honor Nigerian women by bridging gender inequalities and reducing systemic bias on Wikipedia.” He was introduced to Wikipedia and mentored by a woman: Isla Haddow-Flood, a co-founder of Wiki Loves Women.

Together with Florence Devouard, Haddow-Flood worked on developing the idea of a project that could help increase the presence of African women on Wikipedia. “After working together on Kumusha Takes Wiki and Wiki Loves Africa, it was apparent that the content gap relating to women was a real issue,” Devouard and Haddow-Flood wrote in an email to us. They continued:

With less than 20% of (all) Wikipedia contributors being female, the global community has long acknowledged the gender gap as a problem. But in sub-Saharan Africa, when combined with the contributor gap—only 25% of edits to subjects about the Sub-Saharan region come from within the region—the lack of information about women forms an abyss.

 
Wiki Loves Women kicked off in January 2016 with a writing contest that was held as part of Wikipedia’s fifteenth anniversary. Several partners, including the German cultural association Goethe-Institut, and four teams in different African countries joined the initiative. So far, participants of the project have uploaded over 1,000 photos to Wikimedia Commons, the free media repository, in addition to editing and creating a similar number of articles on Wikipedia.

On International Women’s Day in March, Wiki Loves Women will hold a translate-a-thon, an editing event to translate Wikipedia articles about women in different languages. The organizers emphasize that everyone is welcome to join.

“It is time for the people of Africa to tell their own stories, change their narrative, shake up the global stereotypes, and share information about what they value and find interesting and important in the world,” say Haddow-Flood and Devouard.

In brief

Wikimania updates: Scholarship applications for Wikimania 2017, which is being held in Montréal, Canada on 11–13 August, are now being accepted. The deadline is 20 February 2017 at 23:59 UTC. More information is available on the scholarships page and on the FAQ page of the event. Moreover, the steering committee of Wikimania has decided to explore Cape Town in South Africa as a host for Wikimania 2018. A final decision will be made by spring 2017.

Three billion edits: This week, the total edit count on all Wikimedia projects reached 3,000,000,000. Around the same time, the Wikispecies community celebrated creating their 500,000th page. The entry is about Pseudocalotes drogon and was created by Wikimedian Burmeister.

Wikimedia developer summit 2017: Last week, many Wikimedia technical contributors, third-party developers, and users of MediaWiki and Wikimedia APIs gathered at the Golden Gate Club in San Francisco for the Wikimedia developer summit 2017. Over the two days of the event, attendees discussed a list of main topics selected by the community.

Donating data to Wikidata: Wikimedia Germany has published a tutorial video about Wikidata, the collaboratively-edited knowledge base. The short video explores Wikidata and how contributing to the website works.

2016 on the Arabic Wikipedia: Mohamed Badaren, an editor and administrator on the Arabic Wikipedia, has created a video with a summary of the major events in 2016 and their impact on Wikipedia. The video is an adaptation of earlier English-language versions, Edit 2014 and Edit 2015.

New Signpost published: A new edition of the English Wikipedia’s community-written news journal was published this week. Stories included a “surge” in new administrator promotions on the English Wikipedia; an introspective piece looking at the future of the Signpost; coverage of recent research suggesting that women are not more likely to edit about women; an interview with an active Wikipedian who has been blind since birth; and more.

Kurier: New pieces in the Kurier, the German Wikipedia’s “not necessarily neutral [and] non-encyclopedic” news page, include a three-part look back at the year 2016 and an invitation to a Wiki Loves Music event in Hamburg.

Wiki Project Med Foundation is open for members: Wiki Project Med Foundation is a user group that promotes better coverage of medical content on Wikimedia projects. The group is now open for membership applications.

Samir Elsharbaty, Digital Content Intern
Wikimedia Foundation

by Samir Elsharbaty at January 20, 2017 01:29 AM

January 19, 2017

Wikimedia Foundation

Introducing the Wikimedia Resource Center: A hub that helps volunteers find the resources they need

Photo via the Library of the London School of Economics and Political Science, Flickr Commons.

Wikimedia volunteers embrace a wide spectrum of work when it comes to contributing to Wikimedia projects: from reporting a bug, to developing a tool, to requesting a grant to start a new Wikimedia program, and more. As the movement expands to include more affiliates and more programmatic activities every year, newer Wikimedians face a lack of experience in the movement and its various channels for requesting support.

The response to these questions will lead our new Wikimedian to different pages, from Outreach Wikimedia to Meta Wikimedia and MediaWiki.org, as well as connect them to experienced Wikimedians who may be able to help. In recent user experience research, we learned that the majority of program leaders rely heavily on their personal network and personal contacts to find the information they need.

In order to expand Wikimedia communities’ efforts, however, we need to guarantee open access to resources that support this very important work. The Wikimedia Resource Center is a hub designed in response to this issue: it is a single point of entry for Wikimedians all over the world to access the resources and staff support they need to develop new initiatives, and also expand existing ones.

File:Wikimedia Resource Center - Demo.webm

Demo of the new Wikimedia Resource Center.

 

How does it work?

In the Wikimedia Resource Center you will find resources grouped in nine different tabs, according to the goal the resources serve. Let’s imagine you wanted to start a new Wikimedia program. Under the Skills Development tab, you will find evaluation tools, program reports and toolkits, and learning patterns, among other resources. Each tab has an introduction page that describes the area, what each resource means and who can give you direct support in any given topic. Skills Development, together with Grants Support, Programs Support, Product Development, Global Reach, Legal, and Communications, all have the same logic.

Contact and Questions, and Consultations Calendar are slightly different. Under Contact and Questions, you will find frequently asked questions that are searchable by topic. This tab also has a new feature: Ask a question. Wikimedians can use this feature to ask Wikimedia Foundation staff about any topic that is not covered in the FAQ, and they can do so publicly through the Wikimedia Resource Center, or privately via email. Under Contact and Questions, Wikimedians will also find information about the Emergency response system, and in future developments, also a network of Wikimedians.

Consultations Calendar is a public schedule of upcoming collaborations between the Wikimedia Foundation and communities. In this tab, you will also find Wikimedia Community News, which transcludes the content of the calendar on the Meta Wikimedia main page.

If you get lost, you can always find help on the top right corner on every page.

Help us test!

This release constitutes the alpha version of the Wikimedia Resource Center, and at this stage, user feedback is key to improving its functionality. We want to hear from you! If you have comments about the Wikimedia Resource Center, you can submit feedback publicly, on the Talk Page, or privately, via a survey hosted by a third party that shouldn’t take you more than 4 minutes to complete.

We started small, only including resources developed by the Wikimedia Foundation, in order to be able to launch an initial version of the hub. In this way, we can learn what works and what needs to be developed further, to include features to better connect Wikimedians. Check the project’s progress on Meta by clicking here.

We hope that this hub will better support Wikimedians’ efforts all over the world, and improve findability of the resources that empower them to do their best work.

María Cruz, Communications and Outreach Project Manager, Community Engagement
Wikimedia Foundation

by María Cruz at January 19, 2017 07:26 PM

Weekly OSM

weeklyOSM 339

01/10/2017-01/16/2017

Map with new roads

Data collected by Red Cross volunteers 1 | © OpenStreetMap contributors CC-BY-SA 2.0

Mapping

  • Kartitotp shows in her blog post that the community, together with the Mapbox team in Ayacucho, took a great step forward in making the 150,000-inhabitant city in the Peruvian Andes the best mapped city in Latin America. 20 bus routes from 22 public transport companies are now available in OSM.
  • Martin Koppenhöfer raises once again the question why monuments up until today are not clearly distinguished by the two proposed subkeys.

Community

Events

  • Fatouma Harber and Aboul Hassane Cisse hosted a CartoCamp (mapping party) in Tombouctou from 7th-9th January 2017, in collaboration with the OSM community of Mali.
  • Ulf Treger, in his lecture on maps at 33C3, takes a look back at the historical development of maps and map projections to date and their geopolitical background.
  • Selene Yang published a diary of photos from SotM Latam 2016, which took place in São Paulo, Brazil.
  • The State of the Map Africa Working Group has started a logo contest.

Humanitarian OSM

  • In an OSM diary entry, “everyone_sinks_starco” complained about a HOT mapathon in Indonesia. It turned out to be a very effective rant, because various members of HOT Indonesia posted comments to explain what had happened (if you don’t understand Bahasa Indonesia you’ll need to copy and paste the second half of the comments into an online translation tool). In response, user Iyan, the Project Manager of Humanitarian OpenStreetMap Team Indonesia, clarified and explained the project.
  • That mapathons can be done well, too, is shown by the Red Cross: after local mappers were trained, 7000 villages in Liberia, Guinea and Sierra Leone were mapped and GPS traces of 70,000 km of roads & paths were collected by those new volunteers.
  • The blog globalinnovationexchange.org has a very upbeat post published on the topic: Fighting Ebola with Information.

Maps

  • J. Budissin is seeking a volunteer to set up and operate a Sorbian map, as the former admin is not willing to do so anymore. Volunteers from Lusatia and the surrounding area are preferred.

switch2OSM

  • The Chilean tax administration uses OSM maps. (Via osmCL)

Open Data

  • Martin Isenburg reports on rapidlasso.com that now there is open and free LiDAR data in Germany. First, North Rhine-Westphalia and then Thuringia have opened their geoportals for free download of geospatial data at the beginning of 2017. We are full of hope, Martin says, that other federal states will follow their lead. It would simply not make any sense to try to sell this kind of data, as it was shown in England recently.

Software

  • Robot8A tries to convince Telegram developer to use OSM instead of Google Maps. Interesting discussion follows.

Programming

  • Adrien Pavie shows his JS library Pic4Carto, which allows embedding geolocated pictures into a website. Right now it supports Flickr, Mapillary and Wikimedia Commons.
  • Karlos shows the newest changes of OSM go, e.g. built-in 3D-models of benches or wind turbines and first impressions from the London tube.

Releases

Software | Version | Release date | Comment
Locus Map Free * | 3.21.1 | 2017-01-10 | Bugfix release.
Mapbox GL JS | v0.31.0 | 2017-01-10 | One new feature and two bugfixes.
Mapillary iOS * | 4.5.12 | 2017-01-10 | Minor fixes.
OSRM | 5.5.3 | 2017-01-11 | Two enhancements and three bugfixes.
Naviki iOS * | 3.53 | 2017-01-12 | Supporting Apple Watch.
OSM Contributor | 3.0.1 | 2017-01-12 | Bugfix release.
QGIS | 2.18.3 | 2017-01-13 | No info.
libosmium | 2.11.0 | 2017-01-14 | Many changes, please read release info.
Traccar Client Android | 4.0 | 2017-01-14 | No info.
pyosmium | 2.11 | 2017-01-15 | Use current libosmium.

Provided by the OSM Software Watchlist.

(*) unfree software. See: freesoftware.

Did you know …

  • … Franz-Benjamin Mocnik’s visualizations of OpenStreetMap changeset and wiki tags?
  • … TorFlow? It shows the traffic between the individual nodes of Tor in real time.

OSM in the media

  • The MVV, the local traffic company of Munich, will soon launch (automatic translation) a new service based on OpenStreetMap to show arrivals and delays of local trains. The MVV notes that OpenStreetMap data is not only free but also more current than data from HERE.
  • The Federal Agency for Civic Education published an article on how OpenStreetMap could be used for educational purposes in a public school. (automatic translation)
  • The Herald, in Zimbabwe, writes about the importance of collaborative mapping initiatives, such as Missing Maps, in helping to build resilience and a better humanitarian response.

Other “geo” things

  • Examples of using OpenStreetMap data and Mapzen tools in news companies.
  • The QuickMapServices (we reported earlier) now contains more than 555 services.
  • Mashable presents jeans from Spinali Design that help you navigate. We hope for the sake of your safety that only OpenStreetMap data is being used.

Upcoming Events

Where | What | When | Country
Tokyo | 東京!街歩き!マッピングパーティ:第4回 根津神社 | 01/21/2017 | Japan
Manila | 【MapAm❤re】OSM Workshop Series 8/8, San Juan | 01/23/2017 | Philippines
Bremen | Bremer Mappertreffen | 01/23/2017 | Germany
Graz | Stammtisch Graz | 01/23/2017 | Austria
Nottingham | Nottingham Pub Meetup | 01/24/2017 | UK
Dresden | Stammtisch | 02/02/2017 | Germany
Lyon | Stand OSM Salon Primevère | 02/03/2017-02/05/2017 | France
Brussels | FOSDEM 2017 | 02/04/2017-02/05/2017 | Belgium
Genoa | OSMit2017 | 02/08/2017-02/11/2017 | Italy
Cardiff | OpenDataCamp UK | 02/25/2017-02/26/2017 | Wales
Passau | FOSSGIS 2017 | 03/22/2017-03/25/2017 | Germany
Avignon | State of the Map France 2017 | 06/02/2017-06/04/2017 | France
Aizu-wakamatsu Shi | State of the Map 2017 | 08/18/2017-08/20/2017 | Japan
Buenos Aires | FOSS4G+SOTM Argentina 2017 | 10/23/2017-10/28/2017 | Argentina

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Peda, Polyglot, Rogehm, SeleneYang, SomeoneElse, TheFive, YoViajo, derFred, jinalfoflia, keithonearth, vsandre.

by weeklyteam at January 19, 2017 03:44 PM

January 18, 2017

Wikimedia Foundation

Why I wrote 100 articles in 100 days about inspiring Jewish women

Ester Rada, an Israeli musician who now has an article on the Spanish Wikipedia. Photo by Oren Rozen, CC BY-SA 3.0.

Seven months ago, I was looking for a new job.

With little else to do after applying for a new one, I browsed Facebook, where I saw Wikimedian friends posting with #100wikidays.

I quickly discovered that the hashtag referred to a challenge undertaken by Wikipedians to write one new article each day for a hundred days. It was the brainchild of my Bulgarian colleague Spiritia, who only a month earlier was the runner-up Wikipedian of the year for coming up with it.

To release the stress of the job hunt, I decided to do it—but my way, by writing articles about Jewish women on the Spanish, Portuguese, English, and Ladino Wikipedias.

I started with a woman from Venezuela, the country I was born in: Margot Benacerraf, a movie director of Moroccan-Jewish origin who received the Cannes Prize in 1959. Who could imagine that in the late 50s, a young woman from a country little-known to many would capture the attention of critics at the Cannes Festival? Benacerraf is now considered the mother of Venezuelan cinema and is the founder of the National Cinematheque.

Another woman worth mentioning is Houda Nonoo, who served as ambassador of Bahrain to the US from 2008 to 2013. She is the third woman to be an ambassador of Bahrain and the first Jew named as an ambassador from any country in the Arab World.

During these 100 days, I spoke much about these women, telling their stories to everyone who asked about them. One day, I came across an article about a semi-legendary queen in Ethiopia who ended the Axum dynasty, crowned herself, and set the churches of Abyssinia on fire. I asked an Ethiopian friend about her, who immediately replied “Esato? She burned Ethiopia, killed the princes, and took all their gold!”

I don’t know if it’s a legend or not, and neither do historians, but my friend sounded very excited to tell me about her. And now we have an article about Gudit!

On every one of my hundred days, I spent time on the internet looking for another notable Jewish woman whose life would catch my attention. Some were so impressive that I needed to create their articles on the spot.

One was Caterina Tarongí, who was burned alive by the Spanish Inquisition. Her words to her brother on the way to the auto-da-fé have survived through folk songs and expressions. Another was Raquel Líberman, born in Poland, who publicly denounced and broke an Argentine human trafficking network that specialized in Jewish women, declaring: “I can only die once, I won’t withdraw the complaint.” The organized network was comprised of over 30,000 women over a seventy-year period.

At some point, I started searching for interesting Jewish women in other language Wikipedias, looking to spread awareness of these people across national and linguistic borders. The one that interested me most was Violeta Yakova, a Bulgarian resistance fighter during the Second World War. Along with two fellow Jews, Yakova killed well-known anti-semites and Nazi informers. The only article about her was in Bulgarian, but after I translated her article into English, other participants in the challenge translated it into seven more languages! When things like this happen, you get this feeling of accomplishment, of not just contributing to the expansion of free knowledge, but also of engaging other people to do it with you as well. It’s a win-win situation.

#100wikidays also gave me the opportunity to interact with other Wikipedians, many of whom I had never met before. Out of these, one of the most remarkable colleagues I befriended in this experience has been Mervat Salman. Mervat lives in Amman, Jordan; I live in Jerusalem, a religious Jew who became an Israeli citizen at Ben-Gurion Airport.

At first sight, one would only focus on our differences. But there’s more: we both work in the IT industry, we both like Middle Eastern food and music and—the most important thing—we both believe in freedom of knowledge and the need to make it accessible for everyone.

After I finished the challenge, I was exhausted. #100wikidays took up a good deal of my time over those one hundred days, but it was satisfying and completely worth the effort.

But I couldn’t rest for long. Only days later, Mervat started asking me if I wanted to take on the #100WikiCommonsDays challenge—like #100wikidays but with pictures. Since I didn’t start immediately, she asked again, and again … until I started uploading photos to Commons. And here I am, halfway through it!

Inspiration is essential in life. I was inspired by all these 100 women, and I hope others will be too.

Maor Malul, Wikipedian

by Maor Malul at January 18, 2017 11:18 PM

Greg Sabino Mullane

MediaWiki extension.json change in 1.25

I recently released a new version of the MediaWiki "Request Tracker" extension, which provides a nice interface to your RequestTracker instance, allowing you to view the tickets right inside of your wiki. There are two major changes I want to point out. First, the name has changed from "RT" to "RequestTracker". Second, it is using the brand-new way of writing MediaWiki extensions, featuring the extension.json file.

The name change rationale is easy to understand: I wanted it to be more intuitive and easier to find. A search for "RT" on mediawiki.org ends up finding references to the Wikimedia RequestTracker system, while a search for "RequestTracker" finds the new extension right away. Also, the old name was too short and failed to indicate to people what it was. The "rt" tag used by the extension stays the same, however: to produce a table showing all open tickets for user 'alois', you still write:

<rt u='alois'></rt>

The other major change was to modernize it. As of version 1.25 of MediaWiki, extensions are encouraged to use a new system to register themselves with MediaWiki. Previously, an extension would have a PHP file named after the extension that was responsible for doing the registration and setup—usually by mucking with global variables! There was no way for MediaWiki to figure out what the extension was going to do without parsing the entire file, and thereby activating the extension. The new method relies on a standard JSON file called extension.json. Thus, in the RequestTracker extension, the file RequestTracker.php has been replaced with the much smaller and simpler extension.json file.
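To make the contrast concrete, here is a minimal sketch in plain PHP (not MediaWiki's actual loader) of why a JSON manifest helps: the file can be decoded and inspected as data, without running any of the extension's code. The manifest string is an abbreviated copy of fields from the RequestTracker file, and the variable names are my own assumptions.

```php
<?php
// Abbreviated manifest, standing in for the contents of extension.json.
$json = <<<'JSON'
{
    "name": "RequestTracker",
    "type": "parserhook",
    "AutoloadClasses": { "RequestTracker": "RequestTracker_body.php" },
    "Hooks": { "ParserFirstCallInit": [ "RequestTracker::wfRequestTrackerParserInit" ] }
}
JSON;

// Decoding yields plain arrays; nothing from the extension is executed.
$manifest = json_decode( $json, true );

// Collect a human-readable summary of what the extension would register.
$summary = [];
$summary[] = $manifest['name'] . ' (' . $manifest['type'] . ')';
foreach ( $manifest['Hooks'] as $hook => $handlers ) {
    $summary[] = $hook . ' -> ' . implode( ', ', $handlers );
}
foreach ( $manifest['AutoloadClasses'] as $class => $file ) {
    $summary[] = 'class ' . $class . ' in ' . $file;
}

echo implode( "\n", $summary ), "\n";
```

With the old PHP entry point, getting the same overview would have meant including the file, and therefore running it.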

Before going further, it should be pointed out that this is a big change for extensions, and was not without controversy. However, as of MediaWiki 1.25 it is the new standard for extensions, and I think the project is better for it. The old way will continue to be supported, but extension authors should be using extension.json for new extensions, and converting existing ones over. As an aside, this is another indication that JSON has won the data format war. Sorry, XML, you were too big and bloated. Nice try YAML, but you were a little *too* free-form. JSON isn't perfect, but it is the best solution of its kind. For further evidence, see Postgres, which now has outstanding support for JSON and JSONB. I added support for YAML output to EXPLAIN in Postgres some years back, but nobody (including me!) was excited enough about YAML to do more than that with it. :)

The extension.json file asks you to fill in some standard metadata fields about the extension, which are then used by MediaWiki to register and set up the extension. Another advantage of doing it this way is that you no longer need to add a bunch of ugly require_once() calls to your LocalSettings.php file. Now, you simply pass the name of the extension as an argument to the wfLoadExtension() function. You can even load multiple extensions at once with wfLoadExtensions():

## Old way:
require_once("$IP/extensions/RequestTracker/RequestTracker.php");
$wgRequestTracker_URL = 'https://rt.endpoint.com/Ticket/Display.html?id';

## New way:
wfLoadExtension( 'RequestTracker' );
$wgRequestTracker_URL = 'https://rt.endpoint.com/Ticket/Display.html?id';

## Or even load three extensions at once:
wfLoadExtensions( array( 'RequestTracker', 'Balloons', 'WikiEditor' ) );
$wgRequestTracker_URL = 'https://rt.endpoint.com/Ticket/Display.html?id';

Note that configuration changes specific to the extension still must be defined in the LocalSettings.php file.

So what should go into the extension.json file? The extension development documentation has some suggested fields, and you can also view the canonical extension.json schema. Let's take a quick look at the RequestTracker/extension.json file. Don't worry, it's not too long.

{
    "manifest_version": 1,
    "name": "RequestTracker",
    "type": "parserhook",
    "author": [
        "Greg Sabino Mullane"
    ],
    "version": "2.0",
    "url": "https://www.mediawiki.org/wiki/Extension:RequestTracker",
    "descriptionmsg": "rt-desc",
    "license-name": "PostgreSQL",
    "requires" : {
        "MediaWiki": ">= 1.25.0"
    },
    "AutoloadClasses": {
        "RequestTracker": "RequestTracker_body.php"
    },
    "Hooks": {
        "ParserFirstCallInit" : [
            "RequestTracker::wfRequestTrackerParserInit"
        ]
    },
    "MessagesDirs": {
        "RequestTracker": [
            "i18n"
            ]
    },
    "config": {
        "RequestTracker_URL": "http://rt.example.com/Ticket/Display.html?id",
        "RequestTracker_DBconn": "user=rt dbname=rt",
        "RequestTracker_Formats": [],
        "RequestTracker_Cachepage": 0,
        "RequestTracker_Useballoons": 1,
        "RequestTracker_Active": 1,
        "RequestTracker_Sortable": 1,
        "RequestTracker_TIMEFORMAT_LASTUPDATED": "FMHH:MI AM FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_LASTUPDATED2": "FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_CREATED": "FMHH:MI AM FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_CREATED2": "FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_RESOLVED": "FMHH:MI AM FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_RESOLVED2": "FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_NOW": "FMHH:MI AM FMMonth DD, YYYY"
    }
}

The first field in the file is manifest_version, and simply indicates the extension.json schema version. Right now it is marked as required, and I figure it does no harm to throw it in there. The name field should be self-explanatory, and should match your CamelCase extension name, which will also be the subdirectory where your extension will live under the extensions/ directory. The type field simply tells what kind of extension this is, and is mostly used to determine which section of the Special:Version page an extension will appear under. The author is also self-explanatory, but note that this is a JSON array, allowing for multiple items if needed. The version and url are highly recommended. For the license, I chose the dirt-simple PostgreSQL license, whose only fault is its name. The descriptionmsg is what will appear as the description of the extension on the Special:Version page. As it is a user-facing text, it is subject to internationalization, and thus rt-desc is converted to your current language by looking up the language file inside of the extension's i18n directory.
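The message lookup described above can be pictured with a small stand-alone sketch. This is not MediaWiki's message system, only an illustration of the idea under simplifying assumptions: the arrays stand in for decoded i18n files, the French rt-desc entry is deliberately left out to show the fallback, and lookupMessage() is a hypothetical helper.

```php
<?php
// Stand-ins for the decoded i18n/en.json and i18n/fr.json files (subset).
$messages = [
    'en' => [
        'rt-desc'     => 'Fancy interface to RequestTracker using <code>&lt;rt&gt;</code> tag',
        'rt-inactive' => 'The RequestTracker extension is not active',
    ],
    'fr' => [
        'rt-inactive' => "Le module RequestTracker n'est pas actif.",
    ],
];

// Hypothetical helper: try the requested language first, then English,
// and finally return the bare key if no translation exists anywhere.
function lookupMessage( array $messages, $lang, $key ) {
    foreach ( [ $lang, 'en' ] as $code ) {
        if ( isset( $messages[$code][$key] ) ) {
            return $messages[$code][$key];
        }
    }
    return $key;
}

echo lookupMessage( $messages, 'fr', 'rt-inactive' ), "\n"; // found in French
echo lookupMessage( $messages, 'fr', 'rt-desc' ), "\n";     // falls back to English
```

MediaWiki's real fallback chains are richer than a single hop to English, but the principle of resolving a key such as rt-desc against per-language files is the same.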

The requires field only supports a "MediaWiki" subkey at the moment. In this case, I have it set to require at least version 1.25 of MediaWiki - as anything lower will not even be able to read this file! The AutoloadClasses key is the new way of loading code needed by the extension. As before, this should be stored in a php file with the name of the extension, an underscore, and the word "body" (e.g. RequestTracker_body.php). This file contains all of the functions that perform the actual work of the extension.
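As a rough illustration of what a requires check involves, the sketch below compares an installed version against a constraint string like the one in the file. It is an assumption-laden simplification: satisfiesConstraint() is a hypothetical helper that only understands a single operator followed by a dotted version number, whereas MediaWiki's own checker accepts richer constraint syntax.

```php
<?php
// Hypothetical helper: check an installed version against a simple
// "<operator> <version>" constraint such as ">= 1.25.0".
function satisfiesConstraint( $installed, $constraint ) {
    $pattern = '/^\s*(>=|<=|==|!=|>|<)\s*([0-9]+(?:\.[0-9]+)*)\s*$/';
    if ( !preg_match( $pattern, $constraint, $m ) ) {
        throw new InvalidArgumentException( "Unsupported constraint: $constraint" );
    }
    // version_compare() accepts the comparison operator as its third argument.
    return version_compare( $installed, $m[2], $m[1] );
}

// The constraint as it appears under "requires" in extension.json.
$requires = [ 'MediaWiki' => '>= 1.25.0' ];

var_dump( satisfiesConstraint( '1.24.2', $requires['MediaWiki'] ) ); // too old
var_dump( satisfiesConstraint( '1.27.1', $requires['MediaWiki'] ) ); // new enough
```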

The Hooks field is one of the big advantages of the new extension.json format. Rather than worrying about modifying global variables, you can simply let MediaWiki know what functions are associated with which hooks. In the case of RequestTracker, we need to do some magic whenever a <rt> tag is encountered. To that end, we need to instruct the parser that we will be handling any <rt> tags it encounters, and also tell it what to do when it finds them. Those details are inside the wfRequestTrackerParserInit function:

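// Static method of the RequestTracker class; MediaWiki calls it
// for the ParserFirstCallInit hook listed in extension.json.
public static function wfRequestTrackerParserInit( Parser $parser ) {

    // Hand every rt tag to the extension's render method.
    $parser->setHook( 'rt', 'RequestTracker::wfRequestTrackerRender' );

    return true;
}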
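To see how the two pieces fit together, here is a sketch that can be run outside a wiki. The Parser class below is a tiny stand-in written for this example (the real MediaWiki Parser does far more), and the body of the render method is a placeholder rather than the extension's actual code. The callback signature of input text, attribute array, parser, and frame follows the one MediaWiki uses for tag hooks.

```php
<?php
// Stand-in for MediaWiki's Parser, just enough to run this sketch.
class Parser {
    public $tagHooks = [];

    public function setHook( $tag, $callback ) {
        $this->tagHooks[$tag] = $callback;
    }
}

class RequestTracker {
    // Registered under "Hooks" -> "ParserFirstCallInit" in extension.json.
    public static function wfRequestTrackerParserInit( Parser $parser ) {
        $parser->setHook( 'rt', 'RequestTracker::wfRequestTrackerRender' );
        return true;
    }

    // Tag callback: $input is the text between the tags, $args the attributes.
    // Placeholder body; the real extension builds a table of tickets.
    public static function wfRequestTrackerRender( $input, array $args, Parser $parser, $frame = null ) {
        $user = isset( $args['u'] ) ? $args['u'] : '(any)';
        return 'Tickets for ' . htmlspecialchars( $user );
    }
}

// Simulate MediaWiki firing the hook, then meeting an rt tag with u='alois'.
$parser = new Parser();
RequestTracker::wfRequestTrackerParserInit( $parser );
$html = call_user_func( $parser->tagHooks['rt'], '', [ 'u' => 'alois' ], $parser, null );

echo $html, "\n";
```

The point of the design is that extension.json only names the init function; everything else is wired up lazily the first time the parser is needed.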

The config field provides a list of all user-configurable variables used by the extension, along with their default values.
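The relationship between those defaults and a site's own settings can be sketched as follows. This is an illustration rather than MediaWiki's registration code, and it assumes the default "wg" prefix that version 1 manifests apply to config keys; the $siteSettings array is a hypothetical stand-in for values an administrator has set in LocalSettings.php.

```php
<?php
// A subset of the "config" block from extension.json.
$manifest = json_decode( '{
    "config": {
        "RequestTracker_Active": 1,
        "RequestTracker_Sortable": 1,
        "RequestTracker_Cachepage": 0
    }
}', true );

// Hypothetical values an administrator set in LocalSettings.php.
$siteSettings = [ 'wgRequestTracker_Sortable' => 0 ];

// A manifest default applies only where the site has not set a value.
$effective = [];
foreach ( $manifest['config'] as $name => $default ) {
    $global = 'wg' . $name; // assumes the default "wg" prefix
    $effective[$global] = array_key_exists( $global, $siteSettings )
        ? $siteSettings[$global]
        : $default;
}

print_r( $effective );
```

In this sketch only the Sortable setting differs from the shipped defaults; the other two keep the values declared in the manifest.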

The MessagesDirs field tells MediaWiki where to find your localization files. This should always be in the standard place, the i18n directory. Inside that directory are localization files, one for each language, as well as a special file named qqq.json, which gives information about each message string as a guide to translators. The language files are of the format "xxx.json", where "xxx" is the language code. For example, RequestTracker/i18n/en.json contains English versions of all the messages used by the extension. The i18n files look like this:

$ cat en.json
{
  "rt-desc"       : "Fancy interface to RequestTracker using <code>&lt;rt&gt;</code> tag",
  "rt-inactive"   : "The RequestTracker extension is not active",
  "rt-badcontent" : "Invalid content args: must be a simple word. You tried: <b>$1</b>",
  "rt-badquery"   : "The RequestTracker extension encountered an error when talking to the RequestTracker database",
  "rt-badlimit"   : "Invalid LIMIT (l) arg: must be a number. You tried: <b>$1</b>",
  "rt-badorderby" : "Invalid ORDER BY (ob) arg: must be a standard field (see documentation). You tried: <b>$1</b>",
  "rt-badstatus"  : "Invalid status (s) arg: must be a standard field (see documentation). You tried: <b>$1</b>",
  "rt-badcfield"  : "Invalid custom field arg: must be a simple word. You tried: <b>$1</b>",
  "rt-badqueue"   : "Invalid queue (q) arg: must be a simple word. You tried: <b>$1</b>",
  "rt-badowner"   : "Invalid owner (o) arg: must be a valid username. You tried: <b>$1</b>",
  "rt-nomatches"  : "No matching RequestTracker tickets were found"
}

$ cat fr.json
{
  "@metadata": {
     "authors": [
         "Josh Tolley"
      ]
  },
  "rt-desc"       : "Interface sophistiquée de RequestTracker avec l'élement <code>&lt;rt&gt;</code>.",
  "rt-inactive"   : "Le module RequestTracker n'est pas actif.",
  "rt-badcontent" : "Paramètre de contenu « $1 » est invalide: cela doit être un mot simple.",
  "rt-badquery"   : "Le module RequestTracker ne peut pas contacter sa base de données.",
  "rt-badlimit"   : "Paramètre à LIMIT (l) « $1 » est invalide: cela doit être un nombre entier.",
  "rt-badorderby" : "Paramètre à ORDER BY (ob) « $1 » est invalide: cela doit être un champs standard. Voir le manuel utilisateur.",
  "rt-badstatus"  : "Paramètre de status (s) « $1 » est invalide: cela doit être un champs standard. Voir le manuel utilisateur.",
  "rt-badcfield"  : "Paramètre de champs personalisé « $1 » est invalide: cela doit être un mot simple.",
  "rt-badqueue"   : "Paramètre de queue (q) « $1 » est invalide: cela doit être un mot simple.",
  "rt-badowner"   : "Paramètre de propriétaire (o) « $1 » est invalide: cela doit être un mot simple.",
  "rt-nomatches"  : "Aucun ticket trouvé"
}
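
Because every language file is expected to carry the same message keys as en.json, a quick line-oriented comparison can catch a key that is missing or mistyped in a translation. What follows is only a rough sketch, not part of the extension: the function names are made up for illustration, and it assumes one "key" : "value" pair per line, as in the files above. It is not a real JSON parser.

```shell
# Hypothetical helper: print the message keys of an i18n file, one per
# line, sorted. Keys starting with "@" (such as "@metadata") are skipped.
message_keys() {
    sed -n 's/^[[:space:]]*"\([^"]*\)"[[:space:]]*:.*/\1/p' "$1" |
        grep -v '^@' | sort
}

# Hypothetical helper: list keys found in the first file but not in the second.
missing_keys() {
    keys_a=$(mktemp)
    keys_b=$(mktemp)
    message_keys "$1" > "$keys_a"
    message_keys "$2" > "$keys_b"
    comm -23 "$keys_a" "$keys_b"
    rm -f "$keys_a" "$keys_b"
}

# Example (not run here):
#   missing_keys i18n/en.json i18n/fr.json
```

A line whose key is not properly quoted does not match the pattern, so that key is reported as missing, which is usually the hint needed to go and look at the file.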

One other small change I made to the extension was to allow both ticket numbers and queue names to be used inside of the tag. To view a specific ticket, one was always able to do this:

<rt>6567</rt>

This would produce the text "RT #6567", with information on the ticket available on mouseover, and hyperlinked to the ticket inside of RT. However, I often found myself using this extension to view all the open tickets in a certain queue like this:

<rt q="dyson"></rt>

It seems easier to simply add the queue name inside the tags, so in this new version one can simply do this:

<rt>dyson</rt>

If you are running MediaWiki 1.25 or better, try out the new RequestTracker extension! If you are stuck on an older version, use the RT extension and upgrade as soon as you can. :)

by Greg Sabino Mullane (noreply@blogger.com) at January 18, 2017 03:41 AM

Broken wikis due to PHP and MediaWiki "namespace" conflicts

I was recently tasked with resurrecting an ancient wiki: one last updated in 2005, running MediaWiki version 1.5.2, that needed to be transformed into something more modern (in this case, version 1.25.3). The old settings and extensions were not important, but we did want to preserve any content that had been created.

The items available to me were a tarball of the mediawiki directory (including the LocalSettings.php file), and a MySQL dump of the wiki database. To import the items to the new wiki (which already had been created and was gathering content), an XML dump needed to be generated. MediaWiki has two simple command-line scripts to export and import your wiki, named dumpBackup.php and importDump.php. So it was simply a matter of getting the wiki up and running enough to run dumpBackup.php.

My first thought was to simply bring the wiki up as it was - all the files were in place, after all, and specifically designed to read the old version of the schema. (Because the database schema changes over time, newer MediaWikis cannot run against older database dumps.) So I unpacked the MediaWiki directory, and prepared to resurrect the database.

Rather than MySQL, the distro I was using defaulted to using the freer and arguably better MariaDB, which installed painlessly.

## Create a quick dummy database:
$ echo 'create database footest' | sudo mysql

## Install the 1.5.2 MediaWiki database into it:
$ cat mysql-acme-wiki.sql | sudo mysql footest

## Sanity test as the output of the above commands is very minimal:
$ echo 'select count(*) from revision' | sudo mysql footest
count(*)
727977

Success! The MariaDB instance was easily able to parse and load the old MySQL file. The next step was to unpack the old 1.5.2 mediawiki directory into Apache's docroot, adjust the LocalSettings.php file to point to the newly created database, and try and access the wiki. Once all that was done, however, both the browser and the command-line scripts spat out the same error:

Parse error: syntax error, unexpected 'Namespace' (T_NAMESPACE), 
  expecting identifier (T_STRING) in 
  /var/www/html/wiki/includes/Namespace.php on line 52

What is this about? Turns out that some years ago, someone added a class to MediaWiki with the terrible name of "Namespace". Years later, PHP finally caved to user demands and added some non-optimal support for namespaces, which means that (surprise), "namespace" is now a reserved word. In short, older versions of MediaWiki cannot run with modern (5.3.0 or greater) versions of PHP. Amusingly, a web search for this error on DuckDuckGo revealed not only many people asking about this error and/or offering solutions, but many results were actual wikis that are currently not working! Thus, their wiki was working fine one moment, and then PHP was (probably automatically) upgraded, and now the wiki is dead. But DuckDuckGo is happy to show you the wiki and its now-single page of output, the error above. :)

There are three groups to blame for this sad situation, as well as three obvious solutions to the problem. The first group to share the blame, and the most culpable, is the MediaWiki developers who chose the word "Namespace" as a class name. As PHP has long had poor to non-existent support for packages, namespaces, and scoping, it is vital that all your PHP variables, class names, etc. are as unique as possible. To that end, the name of the class was changed at some point to "MWNamespace" - but the damage had been done. The second group to share the blame is the PHP developers, both for not having namespace support for so long, and for making it into a reserved word knowing full well that one of the poster children for "mature" PHP apps, MediaWiki, was using "namespace". Still, we cannot blame them too much for picking what is a pretty obvious word choice. The third group to blame is the owners of all those wikis out there that are suffering that syntax error. They ought to be repairing their wikis. The fixes are pretty simple, which leads us to the three solutions to the problem.


MediaWiki's cool install image

The quickest (and arguably worst) solution is to downgrade PHP to something older than 5.3. At that point, the wiki will probably work again. Unless it's a museum (static) wiki, and you do not intend to upgrade anything on the server ever again, this solution will not work long term. The second solution is to upgrade your MediaWiki! The upgrade process is actually very robust and works well even for very old versions of MediaWiki (as we shall see below). The third solution is to make some quick edits to the code to replace all uses of "Namespace" with "MWNamespace". Not a good solution, but ideal when you just need to get the wiki up and running. Thus, it's the solution I tried for the original problem.
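
The third solution can be roughed out with grep and sed. This is only a sketch, not a record of the edits actually made: it assumes GNU grep and sed, it assumes the old code refers to the class only as "class Namespace", "new Namespace" or "Namespace::", and the function name is invented for illustration. File names such as Namespace.php are deliberately left alone so that existing includes keep working.

```shell
# Hypothetical helper: rewrite references to the old "Namespace" class in
# every .php file under the given directory (GNU grep and sed assumed).
rename_namespace_class() {
    dir=$1
    grep -rl --include='*.php' 'Namespace' "$dir" |
    while IFS= read -r file; do
        sed -i \
            -e 's/\bclass Namespace\b/class MWNamespace/' \
            -e 's/\bnew Namespace\b/new MWNamespace/g' \
            -e 's/\bNamespace::/MWNamespace::/g' \
            "$file"
    done
}

# Example (not run here; work on a copy of the old tree):
#   rename_namespace_class /var/www/html/wiki
```

The word-boundary anchors keep identifiers that merely contain the word from being touched, and running the function twice does not produce "MWMWNamespace".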

However, once I solved the Namespace problem by renaming to MWNamespace, some other problems popped up. I will not run through them here - although they were small and quickly solved, it began to feel like a neverending whack-a-mole game, and I decided to cut the Gordian knot with a completely different approach.

As mentioned, MediaWiki has an upgrade process, which means that you can install the software and it will, in theory, transform your database schema and data to the new version. However, version 1.5 of MediaWiki was released in October 2005, almost exactly 10 years ago from the current release (1.25.3 as of this writing). Ten years is a really, really long time on the Internet. Could MediaWiki really convert something that old? (spoilers: yes!). Only one way to find out. First, I prepared the old database for the upgrade. Note that all of this was done on a private local machine where security was not an issue.

## As before, install mariadb and import into the 'footest' database
$ echo 'create database footest' | sudo mysql test
$ cat mysql-acme-wiki.sql | sudo mysql footest
$ echo "set password for 'root'@'localhost' = password('foobar')" | sudo mysql test

Next, I grabbed the latest version of MediaWiki, verified it, put it in place, and started up the webserver:

$ wget http://releases.wikimedia.org/mediawiki/1.25/mediawiki-1.25.3.tar.gz
$ wget http://releases.wikimedia.org/mediawiki/1.25/mediawiki-1.25.3.tar.gz.sig

$ gpg --verify mediawiki-1.25.3.tar.gz.sig 
gpg: assuming signed data in `mediawiki-1.25.3.tar.gz'
gpg: Signature made Fri 16 Oct 2015 01:09:35 PM EDT using RSA key ID 23107F8A
gpg: Good signature from "Chad Horohoe "
gpg:                 aka "keybase.io/demon "
gpg:                 aka "Chad Horohoe (Personal e-mail) "
gpg:                 aka "Chad Horohoe (Alias for existing email) "
## Chad's cool. Ignore the below.
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 41B2 ABE8 17AD D3E5 2BDA  946F 72BC 1C5D 2310 7F8A

$ tar xvfz mediawiki-1.25.3.tar.gz
$ mv mediawiki-1.25.3 /var/www/html/
$ cd /var/www/html/mediawiki-1.25.3
## Because "composer" is a really terrible idea:
$ git clone https://gerrit.wikimedia.org/r/p/mediawiki/vendor.git 
$ sudo service httpd start

Now, we can call up the web page to install MediaWiki.

  • Visit http://localhost/mediawiki-1.25.3, see the familiar yellow flower
  • Click "set up the wiki"
  • Click next until you find "Database name", and set to "footest"
  • Set the "Database password:" to "foobar"
  • Aha! Look what shows up: "Upgrade existing installation" and "There are MediaWiki tables in this database. To upgrade them to MediaWiki 1.25.3, click Continue"

It worked! Next messages are: "Upgrade complete. You can now start using your wiki. If you want to regenerate your LocalSettings.php file, click the button below. This is not recommended unless you are having problems with your wiki." That message is a little misleading. You almost certainly *do* want to generate a new LocalSettings.php file when doing an upgrade like this. So say yes, leave the database choices as they are, and name your wiki something easily greppable like "ABCD". Create an admin account, save the generated LocalSettings.php file, and move it to your mediawiki directory.

At this point, we can do what we came here for: generate an XML dump of the wiki content in the database, so we can import it somewhere else. We only wanted the actual content, and did not want to worry about the history of the pages, so the command was:

$ php maintenance/dumpBackup.php --current > acme.wiki.2005.xml

It ran without a hitch. However, close examination showed that it had an amazing amount of unwanted stuff from the "MediaWiki:" namespace. While there are probably some clever solutions that could be devised to cut them out of the XML file (either on export, import, or in between), sometimes quick beats clever, and I simply opened the file in an editor and removed all the "page" sections with a title beginning with "MediaWiki:". Finally, the file was shipped to the production wiki running 1.25.3, and the old content was added in a snap:
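
For a dump too large to edit comfortably by hand, the same cut could be scripted. Here is a minimal sketch with awk, with the caveat that it assumes the opening page tag, the title element, and the closing page tag each sit on their own line in the dump; the function name is made up for illustration.

```shell
# Hypothetical helper: print an XML dump with every <page> whose title
# starts with "MediaWiki:" left out. Lines outside pages pass through.
strip_mediawiki_pages() {
    awk '
        /<page>/ { inpage = 1; buf = ""; drop = 0 }
        inpage {
            buf = buf $0 "\n"                        # collect the page
            if ($0 ~ /<title>MediaWiki:/) drop = 1   # mark it unwanted
            if ($0 ~ /<\/page>/) {
                if (!drop) printf "%s", buf          # emit wanted pages only
                inpage = 0
            }
            next
        }
        { print }
    ' "$1"
}

# Example (not run here):
#   strip_mediawiki_pages acme.wiki.2005.xml > acme.wiki.2005.trimmed.xml
```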

$ php maintenance/importDump.php acme.wiki.2005.xml

The script will recommend rebuilding the "Recent changes" page by running rebuildrecentchanges.php (can we get consistentCaps please MW devs?). However, this data is at least 10 years old, and Recent changes only goes back 90 days by default in version 1.25.3 (and even shorter in previous versions). So, one final step:

## 20 years should be sufficient
$ echo '$wgRCMaxAge = 20 * 365 * 24 * 3600;' >> LocalSettings.php
$ php maintenance/rebuildrecentchanges.php
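
The setting is expressed in seconds, so the arithmetic is easy to check in the shell. The helper name below is illustrative only, and it uses the same 365-day years as the LocalSettings.php line above.

```shell
# Hypothetical helper: convert a number of 365-day years to seconds.
years_to_seconds() {
    echo $(( $1 * 365 * 24 * 3600 ))
}

years_to_seconds 20    # prints 630720000
```

For comparison, a 90-day window is 90 * 24 * 3600 = 7776000 seconds.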

Voila! All of the data from this ancient wiki is now in place on a modern wiki!

by Greg Sabino Mullane (noreply@blogger.com) at January 18, 2017 03:23 AM

January 17, 2017

Erik Zachte

Browse winning Wiki Loves Monuments images offline


Click to show full size (1136×640), e.g. for iPhone 5

 

The pages on Wikimedia Commons which list the winners of the yearly contests [1] contain a feature ‘Watch as Slideshow!’. Works great.

However, wouldn’t it be nice if you could also show these images offline (outside a browser), annotated and resized for minimal footprint?

Most end-of-year vacations I do a hobby project for Wikipedia. This time I worked on a script [2] [3] to make the above happen. The script does the following:

  • Download all images from Wiki Loves Monuments winners pages [1]
  • Collect image, author and license info for each image on those winners pages
  • or if not available there, collect these meta data from the upload pages on Commons
  • Resize the images so they are exactly the required size
  • Annotate the image unobtrusively in a matching font size:
    contest year, country, title, author, license

Font size used for 2560×1600 image

 

  • Prefix the downloaded image for super easy filtering on year and/or country
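
The resize, annotate, and prefix steps above could look roughly like this with ImageMagick's convert. This is a sketch under assumptions, not the script described in this post: the function names, the file naming scheme, the fill-then-crop approach to reaching the exact size, and the caption placement and font size are all made up for illustration.

```shell
# Hypothetical naming scheme: year and country first, so that a plain
# alphabetical sort or a glob such as 2016_India_* groups the images.
prefixed_name() {
    year=$1
    country=$2
    file=$3
    printf '%s_%s_%s\n' "$year" "$country" "$(basename "$file")"
}

# Resize to exactly WIDTHxHEIGHT (fill the area, then centre-crop) and
# write a small caption in the lower-left corner. Needs ImageMagick.
resize_and_annotate() {
    src=$1
    dst=$2
    size=$3
    caption=$4
    convert "$src" \
        -resize "${size}^" -gravity center -extent "$size" \
        -gravity southwest -pointsize 18 -fill white \
        -annotate +10+10 "$caption" \
        "$dst"
}

# Example (not run here; image_url is a placeholder):
#   curl -sSL -o winner.jpg "$image_url"
#   resize_and_annotate winner.jpg "$(prefixed_name 2016 India winner.jpg)" \
#       1920x1080 "WLM 2016, India, title, author, license"
```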


I pre-rendered several sets with common image sizes, ready for download. You can request an extra set for other common screen sizes [4] [5]:



For instance the 1920×1080 set is ideal for HDTV (e.g. for Apple TV screensaver) or large iPhones. On a TV the texts are readable as they are; on a phone some manual zooming is needed (but unobtrusiveness is key).

[1] 2010 2011 2012 2013 2014 2015 2016
[2] The script has been tested on Windows 10.
Prerequisites: curl and ImageMagick's convert (in same folder).
[3] I am actually already rewriting the script, separating it into two scripts, to make it more modular and more generally applicable. First script will extract information from WLM/WLE (WLA?) winners pages and image upload pages, and generate a csv file. Second script will read this csv, download images, resize and annotate them. I will announce the git url here when done.
[4] 4K is a bit too large for easy upload. I may do that later when the script can also run on WMF servers.
[5] Current sets are optimal for e.g. HDTV and new iPhones (again, others may follow):
1920×1080 HDTV and iPhone 6+/7+
1334×750 iPhone 6/6s/7
1136×640 iPhone 5/5s 

by Erik at January 17, 2017 12:46 PM

Gerard Meijssen

#Wikimedia - What is our mission

Many Wikipedians have a problem with Wikidata. It is very much cultural. One argument is that Wikidata does not comply with their policies and therefore cannot be used. A case in point is "notability": Wikidata knows about much more, so how can that all be good?

To be honest, Wikidata is immature and it needs to be a lot better. When a Wikipedia community does not want to incorporate data from Wikidata at this point, fine. Let us find what it takes to do so in the future. Let us work on approaches that are possible now and add value to everyone.

Many of the arguments that are used show a lack of awareness of Wikipedia's own history. There are no reminders of the times when it was good to be "bold". It is forgotten that content should be allowed to improve over time, and this is still true for all of the Wikimedia content.

The problem is that Wikidata provides a service to every Wikimedia project, and as a consequence there are parts of a project where Wikidata will never comply with its policies. Arguably, all the policies of all the projects, including Wikidata, serve what the Wikimedia Foundation is about: to ensure that "every single person on the planet is given free access to the sum of all human knowledge". When the argument is framed in this way, the question becomes a different one: how can we benefit from each other, and how can we strengthen the quality of each other's offerings?

Wikidata got a flying start when it replaced all the interwiki links. When all the wiki links and red links are associated with Wikidata links, it will allow for new ways to improve the consistency of Wikipedia. The problem with culture is that it is resistant to change. So when the entrenched practice is that they do not want Wikidata, let's give them the benefits of Wikidata. In a "phabricator" thingie I tried to describe it.

The proposal is for both red links and wiki links to be associated with Wikidata items. It will make it easier to use the data tools associated with Wikidata to verify, curate and improve the Wikipedia content. Obviously every link could have an associated statement. When more and more Wikipedia links are associated with statements, Wikidata improves; as part of the process, these links are verified and errors are removed.

The nice thing is that the proposal allows for it to be "opt in". The old school Wikipedians do not have to notice. It will only be for those who understand the premise of using Wikidata to improve content. In the end it will allow Wikidata and even Wikipedia to mature. It will bring another way to look at quality and it will ensure that all the content of the Wikimedia Foundation will get better integrated and be of a higher quality.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 17, 2017 09:25 AM

Wikimedia Foundation

Wikipedia is built on the public domain

Image by the US Department of Agriculture, public domain/CC0.


Wikipedia is built to be shared and remixed. This is possible, in part, thanks to the incredible amount of material that is available in the public domain. The public domain refers to a wide range of creations that are not restricted by copyright, and can be used freely by anyone. These works can be copied, translated, or remixed, so the public domain provides a crucial foundation for new creative works. On Wikipedia, some articles are based on text from older public domain encyclopedias or include images no longer protected by copyright. People regularly use public domain material to bring educational content to life and create interesting new ways to share it further.

There are three basic ways that material commonly enters the public domain.

First, when you think of the public domain, you may think of the very old creations that have expired copyright. In the United States and many other countries, copyright lasts for the life of the author plus seventy years. Works published before 1923 are in the public domain, but the rest are governed by complex copyright rules. Peter B. Hirtle of Cornell University created a helpful chart to determine when the copyright terms for various types of works will expire in the U.S. Due to the copyright term extension in the 1976 Copyright Act and later amendments, published works from the United States will not start entering the public domain until 2019. In places outside of the U.S., copyright terms expire after shorter terms on January 1, celebrated annually as public domain day.

Second, a valuable contributor to the public domain is the U.S. federal government. Works created by the U.S. government are in the public domain as a matter of law. This means that government websites may provide a rich source of freely usable photographs and other material. A primary purpose of copyright is to promote creation by rewarding people with exclusive rights, but the government does not need this sort of incentive. Government works are already funded directly by taxpayers, and should belong to the public. Putting the government’s creations in the public domain allows everyone to benefit from the government’s work.

Third, some authors choose to dedicate their creations to the public domain. Tools like Creative Commons Zero (CC0) allow people to mark works that the public can freely use without restrictions or conditions. CC0 is used for some highly creative works, like the photographs on Unsplash. Other creators may wish to release their works freely, but still maintain some copyright with minimal conditions attached. These users may adopt a license like Creative Commons Attribution Share-Alike (CC BY-SA) to require other users to provide credit and re-license their works. Most of the photographs on Wikimedia Commons and all the articles on Wikipedia are freely available under CC BY-SA. While these works still have copyright and are not completely in the public domain, they can still be shared and remixed freely alongside public domain material.

In the coming years, legislators in many countries will consider writing new copyright rules to adapt to changes in technology and the economy. One important consideration is how these proposals will protect the public domain to provide room for new creations. The European Parliament has already begun considering a proposed change to the Copyright Directive, including worrying new rights that would make the public domain less accessible to the public. As copyright terms have been extended over the past few decades, works from the 1960s that would otherwise be free of copyright remain expensive and inaccessible. As we consider changing copyright rules, we should remember that everyone, including countless creators, will benefit from a rich and vibrant public domain.

Stephen LaPorte, Senior Legal Counsel
Wikimedia Foundation

Interested in getting more involved? Learn more about the Wikimedia Foundation’s position on copyright, and join the public policy mailing list to discuss how Wikimedia can continue to protect the public domain.

by Stephen LaPorte at January 17, 2017 12:25 AM

January 16, 2017

Wiki Education Foundation

The Roundup: Serious Business

It can be tricky to find publicly accessible, objective information about business-related subjects. It’s more common for there to be monetary incentives to advocate, promote, omit, or underplay particular aspects, points of view, or examples. The concepts can also be complex, weaving together theory, history, law, and a variety of opinions. Effectively writing about business on Wikipedia thus requires neutrality, but also great care in selecting sources and the ability to summarize the best information about a topic. It’s for these reasons that students can make particularly valuable contributions to business topics on Wikipedia. They arrive at the subject without the burden of a conflict of interest that a professional may have, they have access to high-quality sources, and have an expert to guide them on their way.

Students in Amy Carleton’s Advanced Writing in the Business Administration Professions course at Northeastern University made several such contributions.

One student contributed to the article on corporate social responsibility, adding information from academic research on the effects of the business model on things like employee turnover and customer relations.

Another student created the article about the investigation of Apple’s transfer pricing arrangements with Ireland, a three-year investigation into the tax benefits Apple, Inc. received. The result was the “biggest tax claim ever”, though the decision is being appealed.

Overtime is something that affects millions of workers, and which has been a common topic of labor disputes. Wikipedia has an article about overtime in general, but it’s largely an overview of relevant laws. What had not been covered, until a student created an article about them, are the effects of overtime. Similarly, while Wikipedia covers a wide range of immigration topics, it did not yet cover the international entrepreneur rule, a proposed immigration regulation that would admit more foreign entrepreneurs into the United States. As with areas where there are common monetary conflicts of interest, controversial subjects like immigration policy are also simultaneously challenging to write and absolutely crucial to write about.

Some of the other topics covered in the class include philanthropreneurs, the globalization of the football transfer market, peer-to-peer transactions, and risk arbitrage.

Contributing well-written, neutral information about challenging but important topics is a valuable public service. If you’re an instructor who may want to participate, Wiki Ed is here to help. We’re a non-profit organization that can provide you with free tools and staff support for you and your students as you have them contribute to public knowledge on Wikipedia for a class assignment. To learn more, head to teach.wikipedia.org or email contact@wikiedu.org.

Photo: Dodge Hall Northeastern University.jpg, User:Piotrus (original) / User:Rhododendrites (derivative), CC BY-SA 3.0, via Wikimedia Commons.

by Ryan McGrady at January 16, 2017 05:07 PM

Semantic MediaWiki

Semantic MediaWiki 2.4.5 released/en



January 16, 2017

Semantic MediaWiki 2.4.5 (SMW 2.4.5) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

by TranslateBot at January 16, 2017 01:27 PM

Semantic MediaWiki 2.4.5 released


January 16, 2017

Semantic MediaWiki 2.4.5 (SMW 2.4.5) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

by Kghbln at January 16, 2017 01:24 PM

User:Legoktm

MediaWiki - powered by Debian

Barring any bugs, the last set of changes to the MediaWiki Debian package for the stretch release landed earlier this month. There are some documentation changes, and updates for changes to other, related packages. One of the other changes is the addition of a "powered by Debian" footer icon (drawn by the amazing Isarra), right next to the default "powered by MediaWiki" one.

Powered by Debian

This will only be added by default to new installs of the MediaWiki package. But existing users can just copy the following code snippet into their LocalSettings.php file (adjust paths as necessary):

# Add a "powered by Debian" footer icon
$wgFooterIcons['poweredby']['debian'] = [
    "src" => "/mediawiki/resources/assets/debian/poweredby_debian_1x.png",
    "url" => "https://www.debian.org/",
    "alt" => "Powered by Debian",
    "srcset" =>
        "/mediawiki/resources/assets/debian/poweredby_debian_1_5x.png 1.5x, " .
        "/mediawiki/resources/assets/debian/poweredby_debian_2x.png 2x",
];

The image files are included in the package itself, or you can grab them from the Git repository. The source SVG is available from Wikimedia Commons.

by legoktm at January 16, 2017 09:18 AM

January 15, 2017

Wikimedia Foundation

Librarians offer the gift of a footnote to celebrate Wikipedia’s birthday: Join #1lib1ref 2017

Photo by Diliff, CC BY-SA 4.0.


Wikipedia has just turned 16, at a time when the need for accurate, reliable information is greater than ever. In a world where social media channels are awash with fake news, and unreliable assertions come from every corner, the Wikimedia communities and Wikipedia in particular have offered a space for that free, accessible and reliable information to be aggregated and shared with the broader world.

Making sure that the public, our patrons, reach the best sources of information is at the heart of the Wikipedia community’s ideals. The concept of all the information on Wikipedia being “verifiable”, connected to an editorially controlled source, like a reputable newspaper or academic journal, has helped focus the massive collaborative effort that Wikipedia represents.

This connection of Wikipedia’s information to sourcing, however, is an ideal; Wikipedia grows through the contributions of thousands of people every month, and we cannot practically expect every new editor to understand how Wikipedia relies on footnotes, how to find the right kinds of research material, or how to add those references to Wikipedia. All of these steps require not only a broader understanding of research, but also an understanding of how those skills apply to our context.

Unlike an average Wikipedia reader, librarians understand these skills intimately: not only do librarians have training and practical experience finding and integrating reference materials into written works, but they teach patrons these vital 21st-century information literacy skills every day. In the face of a flood of bad information, the health of Wikipedia relies not only on contributors, but also on community educators who can help our readers understand how our content is created. Ultimately, the skills and goals of the library community are aligned with the Wikipedia community.

That is why we are asking librarians to “Imagine a world where every librarian added one more reference to Wikipedia” as part of our second annual “1 Librarian, 1 Reference” (#1lib1ref) campaign. There are plenty of opportunities to get involved: there are over 313,000 “citation needed” statements on Wikipedia and 213,000 articles without any citations at all.

Last year, #1lib1ref spread around the world, helping over 500 librarians contribute thousands of citations, and sparking a conversation among library communities about what role Wikipedia has in the information ecosystem. Still, Wikipedia has over 40 million articles in hundreds of languages; though the hundreds of librarians made many contributions to the English, Catalan and a few other language Wikipedias, we need more to significantly change the experience of Wikipedia’s hundreds of millions of readers.

This year, we are calling on librarians the world over to make #1lib1ref a bigger, better contribution to a real-information based future. We are:

  • Supporting more languages for the campaign
  • Providing a kit to help organize gatherings of librarians to contribute and talk about Wikipedia’s role in librarianship.
  • Extending the campaign for another couple of weeks, from January 15 until February 3.

Share the campaign in your networks and go to your library to ask your librarian to join in the campaign in the coming weeks, to contribute a precious Wikipedia birthday gift to the world: one more citation on Wikipedia!

Alex Stinson, GLAM-Wiki Strategist
Wikimedia Foundation

You can learn more about 1lib1ref at its campaign page.

by Alex Stinson at January 15, 2017 08:31 PM

Gerard Meijssen

#Wikipedia - Who is Fiona Hile?

When you look for Fiona Hile on the English Wikipedia, you will find this. It is a puzzle and there are probably two people by that name that do not have an article (yet).

One of them is an Australian poet. When you google for her you find among other things a picture. When you seek her information on VIAF you find two identifiers and in the near future she will have a third: Wikidata.

From a Wikidata point of view it is relevant to have an item for her because she won two awards. It completes these lists and it connects the two awards to the same person.

When you ask yourself whether Mrs Hile is really "notable", you find that the answer depends on your point of view. Wikipedia already mentions her twice, and surely a discussion on the relative merits of notability is not everyone's cup of tea.

Why is Mrs Hile notable enough to blog about? It is a great example that Wikipedia and Wikidata together can produce more and better information.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 15, 2017 07:40 PM

The Peter Porter Poetry Prize

For me the Peter Porter Poetry Prize is an award like so many others. There is one article; it lists the names of some of the people who are known to have won the prize. Some are linked and some are not. For one winner I linked to a German article and for a few others I created an item.

This list is complete and has a link to a source, so the information can be verified, and I am satisfied with the result up to a point.

What I could do is add more awards and people who have won awards. The article for Tracy Ryan, the 2009 winner, has a category for another award that she won. This award does not have a webpage with all the past winners, so the question is: is Wikipedia good enough as a source? I added the winners to the award, made a mistake, corrected it, and now Wikidata knows about a Nathan Hobby.

Jay Martin is the 2016 winner of the T.A.G. Hungerford Award. It has a source but it is extremely likely that this will disappear in 2017. The problem I have is that I want to see this information shared, but all the work done to improve Wikidata data is not seen at Wikipedia. When we share our resources and when we are better in tune with each other's needs as editors, we will be better able to "share in the sum of our available knowledge".
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 15, 2017 12:20 PM

Is #Wikipedia the new #Britannica?

At the time the Britannica was best of breed. It was the encyclopaedia to turn to. Then Wikipedia happened and obviously it was not good enough; people were not convinced. When you read the discussions about why Wikipedia was not good enough, there was however no actual discussion. The points of view were clear, they had consequences and it was only when research was done that Wikipedia became respectable. Its quality was equally good and it was more informative and included more subjects. The arguments did not go away; the point of view became irrelevant. People, and particularly students, use Wikipedia.

Today Wikipedia is said to be best of breed. It is where you find encyclopaedic information and as Google rates Wikipedia content highly it is seen and used a lot by many people.

The need for information is changing. We have recently experienced a lot of misinformation and the need to know what is factually correct has never been more important. What has become clear is that arguments and information alone do not sway people. So the question is where does that leave Wikipedia?

The question we have to ask is: what does it take to convince people to be open minded? What to do when people expect a neutral point of view but the facts are unambiguous in one direction? What if the language used is not understood? What are the issues of Wikipedia, what are its weaknesses and what are its strengths?

So far quality is considered to be found in sources, in the reputation of its writers. When this is not what convinces, how do we show our quality or better, how do we get people to reconsider and see the other point of view?
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 15, 2017 08:04 AM

January 13, 2017

Weekly OSM

weeklyOSM 338

01/03/2017-01/09/2017

New routing possibilities for wheelchairs

Mapping

  • Regio OSM, a completeness checker for addresses now checks 1702 communities and many cities in Germany, one of the 11 countries where the tool can be used.
  • An interesting combination of OpenData and OSM to improve the OSM data of schools in the UK. One drawback is that a direct link exists only to iD. If iD is open, however, you can open JOSM from there. 😉
  • Pascal Neis describes in a blog post his tools for QA in OSM
  • Arun Ganesh shows the significance of the wikidata=* tag using the example of the North Indian city of Manali. He also points to possibilities for improving OSM with further information via Wikidata, Wikimedia Commons and WikiVoyage, and to information about using Wikidata with Mapbox tools.
  • The OSM Operations team announced a new feature on the main map page: Public GPS-Tracks.
  • Tom Pfeifer asks how coworking spaces, a fairly modern form of cooperation built on sharing workspaces and equipment, should be tagged.
  • Chris uses AutoHotKey (Windows) and JOSM to optimize his mapping experience. He demonstrates this in a video, while tracing building outlines.
  • User rorym shows why it is useful not to make mechanical edits but “look at the area and look for other mistakes!”

Community

OpenStreetMap Foundation

Events

  • Klaus Torsky reports (de) (automatic translation) on the last FOSS4G in Germany. He links to an interview (en) with Till Adams, the brain behind the organisation of FOSS4G in Bonn.
  • Frederik Ramm invites people for the February hack weekend happening in Karlsruhe.
  • A mapping party took place in Tombuctu from 7th to 9th of January.

Humanitarian OSM

  • Kizito Makoye reports on the initiative of the Dar es Salaam City Administration, Tanzania, to use drones in the floodplains of poor regions such as Tandale. The Ramani-Huria project supports this by incorporating the acquired data into OSM-based maps. This and other measures are intended to improve the living conditions and the infrastructure in the slum areas.

Maps

switch2OSM

  • Uber uses OpenStreetMap. Grant Slater expects Uber to contribute to OSM data.

Software

  • The Wikimedia help explains how to use the Wikidata ID to display the outline of OSM Objects in Wikimedia maps.
  • User Daniel writes a diary on how the latest release of Open Source Routing Machine (version 5.5) has made it easier to set up our own routing machine and shares some documentation related to it.

Releases

Open Source Routing Machine (OSRM) released version 5.5, which comes with some huge enhancements in guidance, tags, API and infrastructure.

Software | Version | Release date | Comment
OSRM | 5.5.0 | 2016-12-16 | Navigation, tag interpretation and the API infrastructure have been improved.
JOSM | 11427 | 2016-12-31 | No info.
Mapillary Android * | 3.14 | 2017-01-04 | Much faster GPS fix.
Mapbox GL JS | v0.30.0 | 2017-01-05 | No info.
Naviki Android * | 3.52.4 | 2017-01-05 | Accuracy improved.
Mapillary iOS * | 4.5.11 | 2017-01-06 | Improved onboarding.
SQLite | 3.16.2 | 2017-01-06 | Four fixes.

Provided by the OSM Software Watchlist.

(*) unfree software. See: freesoftware.

Did you know …

  • … the daily updated extracts by Netzwolf?
  • … your next holiday destination? If yes, then the map with georeferenced images in Wikimedia Commons is ideal to inform oneself in advance.
  • … the GPS navigator uNav for Ubuntu smartphones? This OSM-based Navi-App is now available in version 0.64 for the Ubuntu Mobile Operating System (OTA-14).

OSM in the media

  • Tracy Staedter (Seeker) explained the maps of Geoff Boeing. He calls his visualization tool OSMnx (OSM + NetworkX). The tool can render the street network of any city as a black & white grid, showing impressive historical city developments. Boeing says, “The maps help change opinions by demonstrating to people that the density of a city is not necessarily bad.”

Other “geo” things

  • The Open Traffic Partnership (OTP) is an initiative in Manila, Philippines which aims to make use of anonymized GPS data to analyze traffic congestion. The partnership has led to an open source platform – OSM is represented by Mapzen – that enables developing countries to record and analyze traffic patterns. Alyssa Wright, President of the US OpenStreetMap Foundation, said: “The partnership seeks to improve the efficiency and effectiveness of global transport use and supply through open data and capacity expansion.”
  • This is how the Mercator Projection distorts the poles.
  • Treepedia, developed by MIT’s Senseable City Lab and World Economic Forum, provides a visualization of tree cover in 12 major cities including New York, Los Angeles and Paris.

Upcoming Events

Where | What | When | Country
Lyon | Mapathon Missing Maps pour Ouahigouya | 01/16/2017 | France
Brussels | Brussels Meetup | 01/16/2017 | Belgium
Essen | Stammtisch | 01/16/2017 | Germany
Grenoble | Rencontre groupe local | 01/16/2017 | France
Manila | 【MapAm❤re】OSM Workshop Series 7/8, San Juan | 01/16/2017 | Philippines
Augsburg | Augsburger Stammtisch | 01/17/2017 | Germany
Cologne/Bonn | Bonner Stammtisch | 01/17/2017 | Germany
Scotland | Edinburgh | 01/17/2017 | UK
Lüneburg | Mappertreffen Lüneburg | 01/17/2017 | Germany
Viersen | OSM Stammtisch Viersen | 01/17/2017 | Germany
Osnabrück | Stammtisch / OSM Treffen | 01/18/2017 | Germany
Karlsruhe | Stammtisch | 01/18/2017 | Germany
Osaka | もくもくマッピング! #02 | 01/18/2017 | Japan
Leoben | Stammtisch Obersteiermark | 01/19/2017 | Austria
Urspring | Stammtisch Ulmer Alb | 01/19/2017 | Germany
Tokyo | 東京!街歩き!マッピングパーティ:第4回 根津神社 | 01/21/2017 | Japan
Manila | 【MapAm❤re】OSM Workshop Series 8/8, San Juan | 01/23/2017 | Philippines
Bremen | Bremer Mappertreffen | 01/23/2017 | Germany
Graz | Stammtisch Graz | 01/23/2017 | Austria
Brussels | FOSDEM 2017 | 02/04/2017-02/05/2017 | Belgium
Genoa | OSMit2017 | 02/08/2017-02/11/2017 | Italy
Passau | FOSSGIS 2017 | 03/22/2017-03/25/2017 | Germany
Avignon | State of the Map France 2017 | 06/02/2017-06/04/2017 | France
Aizu-wakamatsu Shi | State of the Map 2017 | 08/18/2017-08/20/2017 | Japan
Buenos Aires | FOSS4G+SOTM Argentina 2017 | 10/23/2017-10/28/2017 | Argentina

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Peda, Polyglot, Rogehm, SeleneYang, SomeoneElse, SrrReal, TheFive, YoViajo, derFred, jinalfoflia, keithonearth, wambacher.

by weeklyteam at January 13, 2017 07:00 PM

Wikimedia Tech Blog

Importing JSON into Hadoop via Kafka

Photo by Eric Kilby, CC BY-SA 2.0.

JSON is…not binary

JSON is awesome.  It is both machine and human readable.  It is concise (at least compared to XML), and is even more concise when represented as YAML. It is well supported in many programming languages.  JSON is text, and works with standard CLI tools.

JSON sucks.  It is verbose.  Every value has a key in every single record.  It is schema-less and fragile. If a JSON producer changes a field name, all downstream consumer code has to be ready.  It is slow.  Languages have to convert JSON strings to binary representations and back too often.

JSON is ubiquitous.  Because it is so easy for developers to work with, it is one of the most common data serialization formats used on the web [citation needed!].  Almost any web based organization out there likely has to work with JSON in some capacity.

Kafka was originally developed by LinkedIn, and is now an open source Apache project with strong support from Confluent.   Both of these organizations prefer to work with strongly typed and schema-ed data.  Their serialization format of choice is Avro.  Organizations like this have tight control over their data formats, as it rarely escapes outside of their internal networks.  There are very good reasons Confluent is pushing Avro instead of JSON, but for many, like Wikimedia, it is impractical to transport data in a binary format that is unparseable without extra information (schemas) or special tools.

The Wikimedia Foundation lives openly on the web and has a commitment to work with volunteer open source contributors.  MediaWiki is used by people of varying technical skill levels in different operating environments.  Forcing volunteers and Wikimedia engineering teams to work with serialization formats other than JSON is just mean!  Wikimedia wants our software and data to be easy.

For better or worse, we are stuck with JSON.  This makes many things easy, but big data processing in Hadoop is not one of them.  Hadoop runs in the JVM, and it works more smoothly if its data is schema-ed and strongly typed.  Hive tables are schema-ed and strongly typed.  They can be mapped onto JSON HDFS files using a JSON SerDe, but if the underlying data changes because someone renames a field, certain queries on that Hive table will break.  Wikimedia imports the latest JSON data from Kafka into HDFS every 10 minutes, and then does a batch transform and load process on each fully imported hour.

Camus, Gobblin, Connect

LinkedIn created Camus to import Avro data from Kafka into HDFS.   JSON support was added by Wikimedia.  Camus’ shining feature is the ability to write data into HDFS directory hierarchies based on configurable time bucketing.  You specify the granularity of the bucket and which field in your data should be used as the event timestamp.

However, both LinkedIn and Confluent have dropped support for Camus.  It is an end-of-life piece of software.  Posited as replacements, LinkedIn has developed Gobblin, and Kafka ships with Kafka Connect.

Gobblin is a generic HDFS import tool.  It should be used if you want to import data from a variety of sources into HDFS.  It does not support timestamp bucketed JSON data out of the box.  You’ll have to provide your own implementation to do this.

Kafka Connect is a generic Kafka import and export tool, and has an HDFS Connector that helps get data into HDFS.  It has limited JSON support, and requires that your JSON data conform to a Kafka Connect specific envelope.  If you don’t want to reformat your JSON data to fit this envelope, you’ll have difficulty using Kafka Connect.

That leaves us with Camus.  For years, Wikimedia has successfully been using Camus to import JSON data from Kafka into HDFS.  Unlike the newer solutions, Camus does not do streaming imports, so it must be scheduled in batches. We’d like to catch up with more current solutions and use something like Kafka Connect, but until JSON is better supported we will continue to use Camus.

So, how is it done?  This question appears often enough on Kafka related mailing lists that we decided to write this blog post.

Camus with JSON

Camus needs to be told how to read messages from Kafka, and in what format they should be written to HDFS.  JSON should be serialized and produced to Kafka as UTF-8 byte strings, one JSON object per Kafka message.  We want this data to be written as is with no transformation directly to HDFS.  We’d also like to compress this data in HDFS, and still have it be useable by MapReduce.  Hadoop’s SequenceFile format will do nicely.  (If we didn’t care about compression, we could use the StringRecordWriterProvider to write the JSON records \n delimited directly to HDFS text files.)
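As a rough illustration of the producer side of this contract, here is a minimal Python sketch of turning one event into one UTF-8 JSON byte string per message.  The function names and the sample event are hypothetical, and no Kafka client is involved; how you actually hand the bytes to Kafka depends on the client library you use.

```python
import json


def encode_event(event: dict) -> bytes:
    """Serialize one event as a single-line UTF-8 JSON byte string."""
    # json.dumps without indent never emits raw newlines (newlines inside
    # string values are escaped), so each message stays on one line when
    # records are later written newline-delimited.
    text = json.dumps(event, ensure_ascii=False, separators=(",", ":"))
    return text.encode("utf-8")


def decode_event(message: bytes) -> dict:
    """Inverse of encode_event, roughly what a consumer would do."""
    return json.loads(message.decode("utf-8"))


# Hypothetical event; 'dt' matches the timestamp field used later in this post.
event = {"dt": "2017-01-01T15:40:17", "page": "Zürich"}
message = encode_event(event)
```

Each `message` would then be produced as the value of exactly one Kafka message.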

We’ll now create a camus.properties file that does what we need.

First, we need to tell Camus where to write our data, and where to keep execution metadata about this Camus job.  Camus uses HDFS to store Kafka offsets so that it can keep track of topic partition offsets from which to start during each run:

# Final top-level HDFS data output directory. A sub-directory
# will be dynamically created for each consumed topic.
etl.destination.path=hdfs:///path/to/output/directory

# HDFS location where you want to keep execution files,
# i.e. offsets, error logs, and count files.
etl.execution.base.path=hdfs:///path/to/camus/metadata

# Where completed Camus job output directories are kept,
# usually a sub-dir in the etl.execution.base.path
etl.execution.history.path=hdfs:///path/to/camus/metadata/history

Next, we’ll specify how Camus should read in messages from Kafka, and how it should look for event timestamps in each message.  We’ll use the JsonStringMessageDecoder, which expects each message to be a UTF-8 byte JSON string.  It will deserialize each message using the Gson JSON parser, and look for a configured timestamp field.

# Use the JsonStringMessageDecoder to deserialize JSON messages from Kafka.
camus.message.decoder.class=com.linkedin.camus.etl.kafka.coders.JsonStringMessageDecoder


camus.message.timestamp.field specifies which field in the JSON object should be used as the event timestamp, and camus.message.timestamp.format specifies the timestamp format of that field.  Timestamp parsing is handled by Java’s SimpleDateFormat, so you should set camus.message.timestamp.format to something that SimpleDateFormat understands, unless your timestamp is already an integer UNIX epoch timestamp.  If it is, you should use ‘unix_seconds’ or ‘unix_milliseconds’, depending on the granularity of your UNIX epoch timestamp.

Wikimedia maintains a slight fork of JsonStringMessageDecoder that makes the camus.message.timestamp.field slightly more flexible.  In our fork, you can specify sub-objects using dotted notation, e.g. camus.message.timestamp.field=sub.object.timestamp. If you don’t need this feature, then don’t bother with our fork.

Here are a couple of examples:

Timestamp field is ‘dt’, format is an ISO-8601 string:

# Specify which field in the JSON object will contain our event timestamp.
camus.message.timestamp.field=dt

# Timestamp values look like 2017-01-01T15:40:17
camus.message.timestamp.format=yyyy-MM-dd'T'HH:mm:ss


Timestamp field is ‘meta.sub.object.ts’, format is a UNIX epoch timestamp integer in milliseconds:

# Specify which field in the JSON object will contain our event timestamp.
# E.g. { "meta": { "sub": { "object": { "ts": 1482871710123 } } } }
# Note that this will only work with Wikimedia’s fork of Camus.
camus.message.timestamp.field=meta.sub.object.ts

# Timestamp values are in milliseconds since UNIX epoch.
camus.message.timestamp.format=unix_milliseconds

If the timestamp cannot be read out of the JSON object, JsonStringMessageDecoder will log a warning and fall back to using System.currentTimeMillis().
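To make the behaviour described above more concrete, here is a small Python sketch of the same idea: follow a dotted field path, interpret the value according to the configured format, and fall back to the current time if anything goes wrong.  This is not Camus code.  The function names are hypothetical, the format string is a Python strptime pattern rather than a SimpleDateFormat one, and treating string timestamps as UTC is an assumption of this sketch.

```python
import time
from datetime import datetime, timezone


def get_nested(obj, dotted_path):
    """Follow a dotted path such as 'meta.sub.object.ts' through nested dicts."""
    for key in dotted_path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def event_millis(record, field, fmt):
    """Return the event time in epoch milliseconds, or 'now' if it cannot be read."""
    value = get_nested(record, field)
    try:
        if fmt == "unix_milliseconds":
            return int(value)
        if fmt == "unix_seconds":
            return int(value) * 1000
        # Assumption: string timestamps carry no zone and are treated as UTC.
        parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        return int(parsed.timestamp() * 1000)
    except (TypeError, ValueError):
        # Mirrors the fallback described above: use the current time.
        return int(time.time() * 1000)


nested = {"meta": {"sub": {"object": {"ts": 1482871710123}}}}
print(event_millis(nested, "meta.sub.object.ts", "unix_milliseconds"))
print(event_millis({"dt": "2017-01-01T15:40:17"}, "dt", "%Y-%m-%dT%H:%M:%S"))
```

Note that the fallback silently files unreadable events under the import time, which is worth keeping in mind when hourly counts look off.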

Now that we’ve told Camus how to read from Kafka, we need to tell it how to write to HDFS. etl.output.file.time.partition.mins is important. It tells Camus the time bucketing granularity to use.  Setting this to 60 minutes will cause Camus to write files into hourly bucket directories, e.g. 2017/01/01/15. Setting it to 1440 minutes will write daily buckets, etc.

# Store output into hourly buckets.
etl.output.file.time.partition.mins=60

# Use UTC as the default timezone.
etl.default.timezone=UTC

# Delimit records by newline.  This is important for MapReduce to be able to split JSON records.
etl.output.record.delimiter=\n
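The time bucketing controlled by etl.output.file.time.partition.mins can be pictured with a short Python sketch: floor the event timestamp to the start of its bucket, then turn that into a directory path.  The function name is hypothetical, and the year/month/day/hour layout is an assumption based on the example path in this post; the exact directory layout Camus produces may differ.

```python
from datetime import datetime, timezone


def bucket_path(topic, event_millis, partition_mins=60):
    """Return an illustrative output directory for an event timestamp."""
    bucket_ms = partition_mins * 60 * 1000
    # Floor the timestamp to the start of its bucket.
    floored = (event_millis // bucket_ms) * bucket_ms
    start = datetime.fromtimestamp(floored / 1000, tz=timezone.utc)
    # Assumed layout: topic/YYYY/MM/DD/HH, in UTC.
    return f"{topic}/{start:%Y/%m/%d/%H}"


# 1483285217000 is 2017-01-01T15:40:17 UTC.
print(bucket_path("topicA", 1483285217000, 60))    # topicA/2017/01/01/15
print(bucket_path("topicA", 1483285217000, 1440))  # topicA/2017/01/01/00
```

With hourly buckets, every event stamped between 15:00:00 and 15:59:59 UTC lands in the same directory, regardless of when Camus actually imported it.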


Use SequenceFileRecordWriterProvider if you want to compress data.  To do so, set mapreduce.output.fileoutputformat.compress.codec=SnappyCodec (or another splittable compression codec) either in your mapred-site.xml, or in this camus.properties file.

# SequenceFileRecordWriterProvider writes the records as Hadoop Sequence files
# so that they can be split even if they are compressed.
etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.SequenceFileRecordWriterProvider

# Use Snappy to compress output records.
mapreduce.output.fileoutputformat.compress.codec=SnappyCodec


Finally, some basic Camus configs are needed:

# Replace this with your list of Kafka brokers from which to bootstrap.
kafka.brokers=kafka1001:9092,kafka1002:9092,kafka1003:9092

# These are the kafka topics camus brings to HDFS.
# Replace this with the topics you want to pull,
# or alternatively use kafka.blacklist.topics.
kafka.whitelist.topics=topicA,topicB,topicC

# If whitelist has values, only whitelisted topics are pulled.
kafka.blacklist.topics=

There are various other camus properties you can tweak as well.  You can see some of the ones Wikimedia uses here.

Once this camus.properties file is configured, we can launch a Camus Hadoop job to import from Kafka.

hadoop jar camus-etl-kafka.jar com.linkedin.camus.etl.kafka.CamusJob -P /path/to/camus.properties -Dcamus.job.name="my-camus-job"


The first time this job runs, it will import as much data from Kafka as it can, and write its finishing topic-partition offsets to HDFS.  The next time you launch a Camus job with the same camus.properties file, it will read offsets from the configured etl.execution.base.path HDFS directory and start consuming from Kafka at those offsets.  Wikimedia schedules regular Camus Jobs using boring ol’ cron, but you could use whatever newfangled job scheduler you like.

After several Camus runs, you should see time bucketed directories containing Snappy compressed SequenceFiles of JSON data in HDFS stored in etl.destination.path, e.g. hdfs:///path/to/output/directory/topicA/2017/01/01/15/.  You could access this data with custom MapReduce or Spark jobs, or use Hive’s org.apache.hive.hcatalog.data.JsonSerDe and Hadoop’s org.apache.hadoop.mapred.SequenceFileInputFormat.  Wikimedia creates an external Hive table doing just that, and then batch processes this data into a more refined and useful schema stored as Parquet for faster querying.

Here’s the camus.properties file in full:

#
# Camus properties file for consuming Kafka topics into HDFS.
#

# Final top-level HDFS data output directory. A sub-directory
# will be dynamically created for each consumed topic.
etl.destination.path=hdfs:///path/to/output/directory

# HDFS location where you want to keep execution files,
# i.e. offsets, error logs, and count files.
etl.execution.base.path=hdfs:///path/to/camus/metadata

# Where completed Camus job output directories are kept,
# usually a sub-dir in the etl.execution.base.path
etl.execution.history.path=hdfs:///path/to/camus/metadata/history

# Use the JsonStringMessageDecoder to deserialize JSON messages from Kafka.
camus.message.decoder.class=com.linkedin.camus.etl.kafka.coders.JsonStringMessageDecoder

# Specify which field in the JSON object will contain our event timestamp.
camus.message.timestamp.field=dt

# Timestamp values look like 2017-01-01T15:40:17
camus.message.timestamp.format=yyyy-MM-dd'T'HH:mm:ss

# Store output into hourly buckets.
etl.output.file.time.partition.mins=60

# Use UTC as the default timezone.
etl.default.timezone=UTC

# Delimit records by newline.  This is important for MapReduce to be able to split JSON records.
etl.output.record.delimiter=\n

# SequenceFileRecordWriterProvider writes the records as Hadoop Sequence files
# so that they can be split even if they are compressed.
etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.SequenceFileRecordWriterProvider

# Use Snappy to compress output records.
mapreduce.output.fileoutputformat.compress.codec=SnappyCodec

# Max hadoop tasks to use, each task can pull multiple topic partitions.
mapred.map.tasks=24

# Connection parameters.
# Replace this with your list of Kafka brokers from which to bootstrap.
kafka.brokers=kafka1001:9092,kafka1002:9092,kafka1003:9092

# These are the kafka topics camus brings to HDFS.
# Replace this with the topics you want to pull,
# or alternatively use kafka.blacklist.topics.
kafka.whitelist.topics=topicA,topicB,topicC

# If whitelist has values, only whitelisted topics are pulled.
kafka.blacklist.topics=

# max historical time that will be pulled from each partition based on event timestamp
#  Note:  max.pull.hrs doesn't quite seem to be respected here.
#  This will take some more sleuthing to figure out why, but in our case
#  here it’s ok, as we hope to never be this far behind in Kafka messages to
#  consume.
kafka.max.pull.hrs=168

# events with a timestamp older than this will be discarded.
kafka.max.historical.days=7

# Max minutes for each mapper to pull messages (-1 means no limit)
# Let each mapper run for no more than 9 minutes.
# Camus creates hourly directories, and we don't want a single
# long running mapper to keep other Camus jobs from being launched.
# We run Camus every 10 minutes, so limiting it to 9 should keep
# runs fresh.
kafka.max.pull.minutes.per.task=9

# Name of the client as seen by kafka
kafka.client.name=camus-00

# Fetch Request Parameters
#kafka.fetch.buffer.size=
#kafka.fetch.request.correlationid=
#kafka.fetch.request.max.wait=
#kafka.fetch.request.min.bytes=

kafka.client.buffer.size=20971520
kafka.client.so.timeout=60000

# Controls the submitting of counts to Kafka
# Default value set to true
post.tracking.counts.to.kafka=false

# Stops the mapper from getting inundated with Decoder exceptions for the same topic
# Default value is set to 10
max.decoder.exceptions.to.print=5

log4j.configuration=false

##########################
# Everything below this point can be ignored for the time being,
# will provide more documentation down the road. (LinkedIn/Camus never did! :/ )
##########################

etl.run.tracking.post=false
#kafka.monitor.tier=
kafka.monitor.time.granularity=10

etl.hourly=hourly
etl.daily=daily
etl.ignore.schema.errors=false

etl.keep.count.files=false
#etl.counts.path=
etl.execution.history.max.of.quota=.8

Nuria Ruiz, Lead Software Engineer (Manager)
Andrew Otto, Senior Operations Engineer
Wikimedia Foundation

by Andrew Otto at January 13, 2017 06:05 PM

January 12, 2017

Wikimedia Foundation

Inspire Campaign’s final report shows achievements in gender diversity and representation within the Wikimedia movement

Photo by Flixtey, CC BY-SA 4.0.

Photo by Flixtey, CC BY-SA 4.0.

In March 2015, the Wikimedia Foundation launched its first Inspire Campaign with the goal of improving gender diversity within our movement. The campaign invited ideas on how we as a movement could improve the representation of women within our projects, both in their content and as contributors.

The response and effort from volunteers has been remarkable.  Across the ideas that were funded, there was a diversity of approaches such as:

  • An edit-a-thon series in Ghana to develop content on notable Ghanaian women
  • A tool to track how the gender gap is changing on Wikipedia projects
  • A pilot on mentorship-driven editing between high school and college students

These and other initiatives have resulted in concrete and surprising outcomes, such as:

  • Creating or improving over 12,000 articles, including 126 new biographies on women,
  • Engaging women as project leaders, volunteers, experienced editors and new editors,
  • Correcting gender-related biases within Wikipedia articles.

As this campaign draws to a close we’d like to celebrate the work of our grant-funded projects: the leaders, volunteers, and participants who contributed (many of whom were women), and the achievements that have moved us forward in addressing this topic.

Protecting user privacy through prevention

The internet can be a hostile place, and in this age of information, we have become more cautious about what we reveal about ourselves to others online. You can imagine then that in a campaign designed to attract women, privacy became a central concern for both program leaders and participants.

Program leaders were sensitive to this challenge, and cultivated spaces where women could contribute without compromising their need for privacy. For instance, we asked program leaders to report the number of women who attended their events. Many program leaders pushed back, citing the need to protect privacy. They raised two good points: that some editors choose not to disclose their gender online as a safety measure, and that by even associating their name or username with a public event designed for women, they could inadvertently compromise their privacy. Consequently, the total number of women who participated in these programs was underreported.

In spite of this conflict, it was clear that women were majority participants across funded projects.  In projects hosting multiple events for training or improving project content, such as those hosted by AfroCROWD in New York, the Linguistics Editathon series in Canada and the U.S., and WikiNeedsGirls in Ghana, well over 50% of participants were women across their own events.  Furthermore, in the mentorship groups formed through the Women of Wikipedia (WOW!) Editing Group, all 34 participants were women.  These women showed strong commitment as a result of the program, and in a follow-up survey, many of them wanted to continue contributing together with their mentorship group beyond the program.

Who is missing on Wikipedia?

There is an impressive amount of information on Wikipedia today: over 43 million articles across 284 languages. In English Wikipedia alone, there are over 5 million articles today. A fair number of these articles are dedicated to people: biographies about notable individuals amount to over 6.5 million articles, and this number continues to increase every year.

It can be difficult to see what is missing within this sea of information, and biographies are one well-defined area where the question of “Who is missing?” is particularly pertinent. Today,  biographies about women amount to just over 1 million articles across all languages. One million biographies out of 6 million biographies, or 16% of biographies in total. One million articles out of 43 million articles, or 2% of Wikipedia content in total (whereas 12% of Wikipedia content is biographies of men). This is one way to understand how women are underrepresented on Wikipedia today, and we know even less about the extent of underrepresentation for other non-male gender identities.

One Inspire grant sought to address the visibility of this issue through the development of a tool: Wikipedia Human Gender Indicators (WHGI). WHGI uses Wikidata to track the gender of newly-created biographies by language Wikipedia (and other parameters, such as country or date of birth of the individual), and provides reports in the form of weekly snapshots.

The project has seen solid evidence of usage since its completion. In February 2016 alone the site had approximately 4,000 pageviews and over 1,000 downloads of available datasets. The team also received important feedback from users on the tool: participants in WikiProject Women in Red—a volunteer project that has created more than 44,000 biographies about women—characterized the project as valuable to their work, as it helps them identify notable women to write about.

The first step to addressing a problem is to identify it. WHGI helps us to do that in a concrete, data-driven way.

Why does “Queen-Empress” redirect to “King-Emperor”?

Addressing the gender gap goes beyond addressing gaps in content. It includes igniting conversations and addressing bias in content, bias that might be more subtle or even invisible to casual readers.

Just for the record is an ongoing Gender Gap Inspire grant that focuses on these more subtle forms of content bias on English Wikipedia. One of their events analyzed the process of Wikipedia editing to investigate the possibilities and challenges of gender-neutral writing.

They specifically looked at how pages are automatically re-directed to others (e.g. “Heroine” automatically re-directs to “Hero”) and the direction of those redirects: female to gender-neutral, male to gender-neutral, female to male, male to female. An analysis of almost 200 redirects on English Wikipedia showed that ~100 direct from male/female terms to gender-neutral terms, and ~100 from female to male terms.  For example, “Ballerina” redirects to “Ballet dancer” and “Queen-Empress” redirects to “King-Emperor”.

These redirections may seem like minor technical issues, but they result in an encyclopedia that is rife with systemic bias. Raising awareness of these types of bias, starting discussions on and off wiki, and directly editing language were some of the main approaches Inspire grantees took to address bias.

Learn more!

These and other outcomes can be read in more detail in our full report. We encourage you to read on, learn more about what our grantees achieved, and join us in celebrating these project leaders and their participants! You can also learn more about Community Resources’ first experiment with proactive grantmaking and what we learned from this iteration.

Sati Houston, Grants Impact Strategist
Chris Schilling, Community Organizer
Wikimedia Foundation

by Sati Houston and Chris Schilling at January 12, 2017 06:26 PM

Gerard Meijssen

#Wikidata - Clare Hollingworth and #sources

Mrs Hollingworth was a famous journalist. She recently passed away and as I often do, I added content to the Wikidata item of the person involved.

Information like awards is something I often add and it was easy enough to establish that Mrs Hollingworth received a Hannen Swaffer Award in 1962. I found a source for the award and I had my confirmation.

The Wikipedia article has it that "She won the James Cameron Award for Journalism (1994)." There is however no source and I can find a James Cameron lecture and award but Mrs Hollingworth is not noted as receiving this award; it is Ed Vulliamy.

People often say that Wikipedia is not a source. The problem is that for Wikidata it often is. Particularly in the early days of Wikidata massive amounts of data were lifted off the Wikipedias and it is why there is so much initial content to build upon.

When you work from sources, you find an issue with the Wikipedia content. My source does not know about Mr Paul Foot either. Mrs Lyse Doucet does have a Wikipedia article but she is not linked in the Wikipedia list.

To truly get to the bottom of issues like these takes research, and I am neither willing nor able to do this for each and every subject that I touch. It is impossible to work on all the issues that exist because of everything that I did. I have over 2.1 million edits on Wikidata. What I do is make a start and I am happy to be condemned for the work that I did, work that does have issues but they are all there to be solved someday.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 12, 2017 08:54 AM

Resident Mario

Wikimedia Foundation

Writing Ghana into Wikipedia: Felix Nartey

Editor’s note: We’re testing new video coding that will allow Safari and Edge users to play videos like Felix’s directly on the blog. Please bear with us. In the meantime, you can see the video on Wikimedia Commons.

Video by Victor Grigas, CC BY-SA 3.0.

The chaos of the Second World War touched all corners of the world, and the Gold Coast (now Ghana) was no exception. The colony’s resources were marshaled for the war effort, and the headquarters of Britain’s West Africa Command were located in Accra. Nearly 200,000 soldiers from the area would eventually sign up to serve in various branches of the armed forces.

In support of these efforts, No. 37 General Hospital was established in Accra to provide medical care for injured Allied troops. The hospital’s name was changed to 37 Military Hospital of the Gold Coast shortly before the colony gained independence. It is now open to the public and serves as a teaching hospital for graduate and medical students.

Despite its impact on healthcare in Accra, there was no Wikipedia article about the hospital until February 2014, when Wikipedian Felix Nartey created it.

Photo by Ruby Mizrahi/Wikimedia Foundation, CC BY-SA 3.0.

“I was walking with a friend from an event when we just thought there was no picture of this place on Wikimedia Commons, the free media repository,” says Nartey, who formerly served as community manager for the Wikimedia Ghana user group. “My friend took a picture with my tablet, and so did I, then we headed home.”

Nartey wanted to use his photo in the hospital’s article on Wikipedia, but was shocked to find no article for it: the search results page told him that “no results” matched his search term, but that he could create a page about it.

He created a short “stub” article that night. Only a few weeks later, two other editors expanded it into an informative entry that pleasantly surprised him the next time he visited the article.

Nartey believes that knowledge sharing activities like editing Wikipedia have an effect on people’s lives, but at the present time there are significant content gaps on the site. There are fewer articles about topics on the African continent than there are about Europe or North America, and those that do exist tend to be shorter and less detailed.

Increasing the diversity of contributions on Wikipedia helps achieve higher-quality content and combat systemic bias, and that’s why many people—including Nartey—are trying to figure out the reasons behind these gaps and are putting forth great effort to bridge them.

In Ghana, a current paucity of contributors could “be partly blamed on the current growing unemployment situation, which is certainly an impediment to people’s willingness to do things for free,” says Nartey. But even in better times, while “the internet connection is reasonable in urban areas, it’s expensive, so people tend to go without it or place it last on their list of priorities, which, of course, affects contributions to Wikipedia.”

But when Nartey and other volunteers start editing Wikipedia, the positive energy created works as an incentive for them to maintain their contributions. He explains:

The only way you can have an impact in this world is to always leave something behind from where you came from and give back to society, whatever that means for you… That is the feeling I get whenever I edit Wikipedia. And I feel like it’s the joy of every Wikipedian to really see your impact.

 
In addition to editing, Nartey leads several initiatives in Ghana where he promotes the importance of editing Wikipedia. Some examples include GLAM activities, the Wikipedia Education Program, the Wikipedia Library, etc. In these activities, Nartey speaks with students, cultural organization officials, and Wikipedians to find the best ways to encourage people from his country to contribute.

“I mostly teach people about the essence of wanting to contribute to Wikipedia,” Nartey explains. “Information itself is useless until it’s shared with the whole world. And the only way you can do that is through a medium like Wikipedia. You need to [highlight] that essence in the minds of people and inspire them to contribute to Wikipedia. It’s easy for you to tap in and just tell someone you need to do this because Wikipedia is already creating that impact.”

Interview by Ruby Mizrahi, Interviewer
Outline by Tomasz Kozlowski, Blog Writer, Wikimedia Foundation
Profile by Samir Elsharbaty, Digital Content Intern, Wikimedia Foundation

by Ruby Mizrahi, Tomasz Kozlowski and Samir Elsharbaty at January 12, 2017 12:16 AM

January 10, 2017

Wikimedia Foundation

Wikimedia Foundation joins EFF and others encouraging the California Court of Appeal to protect online free speech

Photo by Coolcaesar, CC BY-SA 3.0.

On Tuesday, January 10, 2017, the Wikimedia Foundation joined an amicus brief filed by the Electronic Frontier Foundation, Eric Goldman, Rebecca Tushnet, the Organization for Transformative Works, Engine, GitHub, Medium, Snap, and Yelp encouraging the California Court of Appeal to review the ruling of the trial court in Cross v. Facebook. The case involves important principles of freedom of speech and intermediary liability immunity (which shields platforms like Wikimedia, Twitter, and Facebook from liability for content posted by users), both essential to the continued health of the Wikimedia projects.

The case began when users on Facebook created a Facebook page which criticized the plaintiff, a musician, based on his business practices. The plaintiff (along with the label and marketing companies that represented him) brought suit against Facebook with a number of claims including misuse of publicity rights. The trial court denied Facebook’s anti-SLAPP motion and found that the plaintiff could assert a right of publicity claim against Facebook. Worryingly, under the trial court’s reasoning, such a claim arises for any speech on social media that is: (i) about a real person; and (ii) published on a website that includes advertisements. In other words, a platform that carries advertising can be held liable for the speech of its users merely because this speech relates to a real person. The court’s reasoning is not consistent with well-established rules for limits to online speech.

Facebook filed an appeal against this ruling before the California Court of Appeal where the case is currently pending. In our amicus brief, we encourage the Court of Appeal to review the lower court’s decision by pointing to the legal and policy consequences of the lower court’s ruling.

We and our co-signers argue that the court reached this absurd result through two major errors in its reasoning. First, the court did not follow the well-established First Amendment limits to the right of publicity. Second, the court did not correctly apply the immunity granted in CDA Section 230. Congress enacted Section 230 to encourage the development of the internet and other interactive media by shielding intermediaries not only from liability for actionable content created or posted by users, but also from the cost and uncertainty associated with litigation itself. This framework is essential to the success of the Wikimedia projects and many other major websites across the internet that host user-generated content. If the ruling is allowed to stand, a social media site such as Facebook, Twitter, or Tumblr could be sued for any post about a real person made by a user, ultimately undermining congressional intent.

We hope that the California Court of Appeal will protect the First Amendment right to comment on and criticize public figures. We also urge the court to uphold the immunity granted under US law to intermediaries that enables robust free speech and has become a fundamental pillar in the architecture of the internet.

Tarun Krishnakumar, Legal Fellow
Stephen LaPorte, Senior Legal Counsel
Wikimedia Foundation

Special thanks to the Electronic Frontier Foundation for drafting this amicus brief, and to Aeryn Palmer for leading the Wikimedia Foundation’s contribution.

by Tarun Krishnakumar and Stephen LaPorte at January 10, 2017 11:46 PM

This month in GLAM

This Month in GLAM: December 2016

by Admin at January 10, 2017 11:06 AM

Jeroen De Dauw

PHP 7.1 is awesome

PHP 7.1 has been released, bringing some features I was eagerly anticipating and some surprises that had gone under my radar.

New iterable pseudo-type

This is the feature I’m most excited about, perhaps because I had no clue it was in the works. In short, iterable allows for type hinting in functions that just loop through their parameter’s value, without restricting the type to either array or Traversable and without giving up the type hint altogether. This partially solves one of the points I raised in my Missing in PHP 7 series post Collections.

Nullable types

This feature I also already addressed in Missing in PHP 7 Nullable return types. What somehow escaped my attention is that PHP 7.1 comes not just with nullable return types, but also new syntax for nullable parameters.

Intent revealing

Other new features that I’m excited about are the Void Return Type and Class Constant Visibility Modifiers. Both of these help reveal the author’s intent, reduce the need for comments and make it easier to catch bugs.

A big thank you to the PHP contributors that made these things possible and keep pushing the language forwards.

For a full list of new features, see the PHP 7.1 release announcement.

by Jeroen at January 10, 2017 10:23 AM

January 09, 2017

Wikimedia Tech Blog

Wikimedia Foundation receives $3 million grant from Alfred P. Sloan Foundation to make freely licensed images accessible and reusable across the web

Photo by Ajepbah, CC BY-SA 3.0 DE.

The Wikimedia Foundation, with a US$3,015,000 grant from the Alfred P. Sloan Foundation, is leading an effort to enable structured data on Wikimedia Commons, the world’s largest repository of freely licensed educational media. The project will support contributors’ efforts to integrate Commons’ media more readily into the rest of the web, making it easier for people and institutions to share, access, and reuse high-quality and free educational content.

Wikimedia Commons includes more than 35 million freely licensed media files—photos, audio, and video—ranging from stunning photos of geographic landscapes to donations from institutions with substantial media collections, like the Smithsonian, NASA, and the British Library. Like Wikipedia, Wikimedia Commons has become a “go-to” source on the internet—used by everyone from casual browsers to major media outlets to educational institutions, and easily discoverable through search engines. It continues to rapidly grow every year: Volunteer contributors added roughly six million new files in 2016.

Today, the rich images and media in Wikimedia Commons are described only by casual notation, making it difficult to fully explore and use this remarkable resource. The generous contribution from the Sloan Foundation will enable the Wikimedia Foundation to connect Wikimedia Commons with Wikidata, the central storage for structured data within the Wikimedia projects. Wikidata will empower Wikimedia volunteers to transform Wikimedia Commons into a rich, easily-searchable, and machine-readable resource for the world.

Over three years, the Wikimedia Foundation will develop infrastructure, tools, and community support to enable the work of contributors, who have long requested a way to add more precise, multilingual and reusable data to media files. This will support new uses of Commons’ media, from richer and more dynamic illustration of articles on Wikipedia, to helping new users, like museums, remix the media in their own applications. Structured data will also be compatible with and support Wikimedia Commons’ partnership communities, including “GLAM” institutions (galleries, libraries, archives, museums) that have donated thousands of images in recent years. With the introduction of structured data on Commons, GLAM institutions will be able to more easily upload media and integrate existing metadata into Wikimedia Commons and share their collections with the rest of the web.

“At Wikimedia, we believe the world should have access to the sum of all knowledge, from encyclopedia articles to archival images,” said Katherine Maher, Executive Director of the Wikimedia Foundation. She continued:

Wikimedia Commons is a vast library of freely licensed photography, video, and audio that illustrates knowledge and the internet itself. With this project, and in partnership with the Wikimedia community of volunteer contributors, we hope to expand the free and open internet by supporting new applications of the millions of media files on Wikimedia Commons. We are grateful for the generous support of the Sloan Foundation, our longtime funders, in this important work.

 
“We are delighted to continue our near-decade-long support of Wikimedia with this potentially game-changing grant to unlock millions of media files—the most common form of modern communication and popular education, growing exponentially each year—into a universal format that can be read and reused not just by Wikipedia’s hundreds of millions of readers in nearly 300 languages but by educational, cultural and scientific organizations and by anyone doing a Google search or using the web,” said Doron Weber, Vice President and Program Director at the Alfred P. Sloan Foundation.

At a time when the World Wide Web, like the rest of the world, is beset by increasing polarization, commercialization, and narrowing, Wikipedia continues to serve as a shining, global counter-example of open collaborative knowledge sharing and consensus building presented in a reliable context with a neutral point of view, free of fake news and false information, that emphasizes how we can come together to build the sum of all human knowledge. We all need Wikipedia, its sister projects, its technology, and its values, now more than ever.

 
The Wikimedia Foundation is partnering on this project with Wikimedia Germany (Deutschland), the independent nonprofit organization dedicated to supporting the Wikimedia projects in Germany. Wikimedia Germany incubated and oversaw Wikidata’s initial operations, and continues to manage Wikidata’s technical and engineering roadmap. The project will be overseen in consultation with the Wikimedia community of volunteer contributors on collaboration and community support. The US$3,015,000 grant from the Sloan Foundation will be given over a three-year period.

For more information, please visit the structured data page on Wikimedia Commons.

by Wikimedia Foundation at January 09, 2017 07:41 PM

January 08, 2017

Jeroen De Dauw

Rewriting the Wikimedia Deutschland fundraising

Last year we rewrote the Wikimedia Deutschland fundraising software. In this blog post I’ll give you an idea of what this software does, why we rewrote it and the outcome of this rewrite.

The application

Our fundraising software is a homegrown PHP application. Its primary functions are donations and membership applications. It supports multiple payment methods, needs to interact with payment providers, supports submission and listing of comments and exchanges data with another homegrown PHP application that does analysis, reporting and moderation.

fun-app

The codebase was originally written in a procedural style, with most code residing directly in files (i.e., not even in a global function). There was very little design, and completely separate concerns such as presentation and data access were mixed together. As you can probably imagine, this code was highly complex and very hard to understand or change. There was unused code, broken code, features that might not be needed anymore, and mysterious parts whose purpose was unknown even to our guru who maintained the codebase during the last few years. This mess, combined with the complete lack of a specification and unit tests, made development of new features extremely slow and error prone.

derp-code

Why we rewrote

During the last year of the old application’s lifetime, we did refactor some parts and tried adding tests. In doing so, we figured that rewriting from scratch would be easier than trying to make incremental changes. We could start with a fresh design, add only the features we really need, and perhaps borrow some reusable code from the less horrible parts of the old application.

They did it by making the single worst strategic mistake that any software company can make: […] rewrite the code from scratch. —Joel Spolsky

We were aware of the risks involved with doing a rewrite of this nature and that often such rewrites fail. One big reason we did not decide against rewriting is that we had a time period of 9 months during which no new features needed to be developed. This meant we could freeze the old application and avoid parallel development, which would have resulted in some kind of feature race. Additionally, we set some constraints: we would only rewrite this application and leave the analysis and moderation application alone, and we would do a pure rewrite, avoiding the addition of new features into the new application until the rewrite was done.

How we got started

Since we had no specification, we tried visualizing the conceptual components of the old application, and then identified the “commands” they received from the outside world.

old-fun-code-diagram

Creating the new software

After some consideration, we decided to try out The Clean Architecture as a high level structure for the new application. For technical details on what we did and the lessons we learned, see Implementing the Clean Architecture.
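To give a rough feel for the idea of one "command" from the outside world being handled by application logic that knows nothing about the web or the database, here is a minimal sketch. It is written in Python for brevity, whereas the real application is PHP, and every name in it (`AddDonationUseCase`, `DonationRepository`, and so on) is hypothetical rather than taken from the actual codebase.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class AddDonationRequest:
    """Plain data describing one 'command' from the outside world."""
    donor_name: str
    amount_cents: int


@dataclass(frozen=True)
class AddDonationResponse:
    success: bool
    message: str


class DonationRepository(Protocol):
    """Boundary the use case depends on; persistence details live elsewhere."""

    def store(self, donor_name: str, amount_cents: int) -> None: ...


class InMemoryDonationRepository:
    """Test double standing in for a database-backed repository."""

    def __init__(self) -> None:
        self.stored: List[Tuple[str, int]] = []

    def store(self, donor_name: str, amount_cents: int) -> None:
        self.stored.append((donor_name, amount_cents))


class AddDonationUseCase:
    """Application logic for one command, free of web and database code."""

    def __init__(self, repository: DonationRepository) -> None:
        self._repository = repository

    def add_donation(self, request: AddDonationRequest) -> AddDonationResponse:
        # Validation rule invented for the sketch.
        if request.amount_cents <= 0:
            return AddDonationResponse(False, "Amount must be positive")
        self._repository.store(request.donor_name, request.amount_cents)
        return AddDonationResponse(True, "Donation stored")
```

Because the use case only sees the repository interface, it can be tested with the in-memory double and without a web server or database.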

The result

With a team of 3 people, we took about 8 months to finish the rewrite successfully. Our codebase is now clean and much, much easier to understand and work with. It took us over two man years to do this clean up, and presumably an even greater amount of time was wasted in dealing with the old application in the first place. This goes to show that the cost of not working towards technical excellence is very high.

We’re very happy with the result. For us, the team that wrote it, it’s easy to understand, and the same seems to be true for other people based on feedback we got from our colleagues in other teams. We have tests for pretty much all functionality, so can refactor and add new functionality with confidence. So far we’ve encountered very few bugs, with most issues arising from us forgetting to add minor but important features to the new application, or misunderstanding what the behavior should be and then correctly implementing the wrong thing. This of course has more to do with the old codebase than with the new one. We now have a solid platform upon which we can quickly build new functionality or improve what we already have.

The new application is the first Wikimedia (Deutschland) application deployed on, and written in, PHP 7. Even though it was not an explicit goal of the rewrite, the new application has ended up with better performance than the old one, in part due to the use of PHP 7.

Near the end of the rewrite we got an external review performed by thePHPcc, during which Sebastian Bergmann, who you might know from PHPUnit fame, looked for code quality issues in the new codebase. The general result of that was a thumbs up, which we took the creative license to translate into this totally non-Sebastian approved image:

You can see our new application in action in production. I recommend you try it out by donating 🙂

Technical statistics

These are some statistics for fun. They were compiled after we did our rewrite, and were not used during development at all. As with most software metrics, they should be taken with a grain of salt.

In this visualization, each dot represents a single file. The size represents the Cyclomatic complexity while the color represents the Maintainability Index. The complexity is scored relative to the highest complexity in the project, which in the old application was 266 and in the new one is 30. This means that the red on the right (the new application) is a lot less problematic than the red on the left. (This visualization was created with PhpMetrics.)

fun-complexity
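Why the same shade means something different in each chart can be shown with a small calculation. The sketch below assumes the relative score is a simple linear ratio of a file's complexity to the project maximum; the exact scaling PhpMetrics uses is not stated here, so treat `relative_complexity` as a hypothetical helper.

```python
OLD_MAX_COMPLEXITY = 266  # highest cyclomatic complexity in the old application
NEW_MAX_COMPLEXITY = 30   # highest cyclomatic complexity in the new application


def relative_complexity(complexity, project_max):
    """Score a file between 0 and 1 relative to its project's worst file.

    Assumes a simple linear ratio; the tool's real scaling may differ.
    """
    if project_max <= 0:
        raise ValueError("project_max must be positive")
    return complexity / project_max


if __name__ == "__main__":
    # The same absolute complexity lands at opposite ends of the two charts.
    print(relative_complexity(30, NEW_MAX_COMPLEXITY))
    print(round(relative_complexity(30, OLD_MAX_COMPLEXITY), 3))
```

Under that assumption, a file with complexity 30 is the worst offender in the new application but sits near the bottom of the scale in the old one.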

Global access in various Wikimedia codebases (lower is better). The rightmost is the old version of the fundraising application, and the one next to it is the new one. The new one has no global access whatsoever. LLOC stands for Logical Lines of Code. You can see the numbers in this public spreadsheet.

global-access-stats

Static method calls, often a big source of global state access, were omitted, since the tools used count many false positives (e.g. alternative constructors).

The differences between the projects can be made more apparent by visualizing them in another way. Here you have the number of lines per global access, represented on a logarithmic scale.

lloc-per-global
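The quantity behind this chart is easy to compute. The sketch below is an illustration only: the function names are hypothetical, and treating a codebase with zero global accesses as "infinitely many lines per access" is an assumption about how to handle that case, not a description of how the chart was drawn.

```python
import math


def lloc_per_global_access(lloc, global_accesses):
    """Logical lines of code per global access (higher is better)."""
    if global_accesses < 0:
        raise ValueError("global_accesses cannot be negative")
    if global_accesses == 0:
        # No global access at all: the ratio is unbounded.
        return math.inf
    return lloc / global_accesses


def log10_position(value):
    """Position of a positive value on a base-10 logarithmic axis."""
    if value <= 0:
        raise ValueError("logarithmic axes only hold positive values")
    return math.log10(value)


if __name__ == "__main__":
    # 1000 lines and 10 accesses are made-up numbers for illustration.
    ratio = lloc_per_global_access(1000, 10)
    print(ratio, log10_position(ratio))
    # 4671 is the new application's production LLOC; it has no global access.
    print(lloc_per_global_access(4671, 0))
```

On a logarithmic axis each step represents a tenfold difference, which is why codebases that differ by orders of magnitude can still be compared in one chart.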

The following stats have been obtained using phploc, which counts namespace declarations and imports as LLOC. This means that for the new application some of the numbers are very slightly inflated.

  • Average class LLOC: 31 => 21
  • Average method LLOC: 4 => 3
  • Cyclomatic Complexity / LLOC : 0.39 => 0.10
  • Cyclomatic Complexity / Number of Methods: 2.67 => 1.32
  • Global functions: 58 => 0
  • Total LLOC: 5517 => 10187
  • Test LLOC: 979 => 5516
  • Production LLOC: 4538 => 4671
  • Classes: 105 => 366
  • Namespaces: 14 => 105
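A few derived figures can be read off these numbers. The following sketch copies the LLOC values from the list above and computes the share of test code before and after the rewrite; the helper name `summarize` and the derived ratios are additions for illustration, not part of the phploc output.

```python
# (old, new) values copied from the phploc statistics above.
STATS = {
    "total_lloc": (5517, 10187),
    "test_lloc": (979, 5516),
    "production_lloc": (4538, 4671),
}


def summarize(stats=STATS):
    """Derive test-code ratios for the old and the new application."""
    summary = {}
    for index, label in enumerate(("old", "new")):
        total = stats["total_lloc"][index]
        tests = stats["test_lloc"][index]
        production = stats["production_lloc"][index]
        summary[label] = {
            # Sanity check: test + production lines should equal the total.
            "parts_add_up": tests + production == total,
            "test_share": tests / total,
            "test_per_production_line": tests / production,
        }
    return summary


if __name__ == "__main__":
    for label, figures in summarize().items():
        print(label, figures)
```

Test code grows from roughly 18% of all logical lines to roughly 54%, while production code stays almost the same size, so nearly all of the growth in total LLOC comes from tests.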

This is another visualization created with PhpMetrics that shows the dependencies between classes. Dependencies are static calls (including to the constructor), implementation and extension, and type hinting. The application’s top-level factory can be seen at the top right of the visualization.

fun-dependencies

by Jeroen at January 08, 2017 09:02 AM

January 07, 2017

User:Bluerasberry

Year of Science 2016 – a new model for Wikipedia outreach

The Year of Science was a 2016 Wikipedia outreach campaign managed by the Wiki Education Foundation with funding support from the Simons Foundation. The campaign had several goals, including developing science articles on Wikipedia, recruiting scientists as volunteer Wikipedia editors, promoting discussions about the culture and impact of Wikipedia in the scientific community, and integrating more science themes into existing Wikipedia community programs.

It is easy to say that the Year of Science was one of the biggest and highest impact campaigns which the Wikipedia community has produced to date. Previous campaigns rarely lasted more than a month, and rarely included multiple events in multiple cities or recruited so many participants. It is unprecedented for any Wikipedia campaign to bring so many discussions to professional spaces, but Year of Science included talks and workshops at academic conferences throughout the year. The very brand and idea of a “year of science” was provocative to see in circulation around Wikipedia, and pushed the community’s imagination of what is possible.

The campaign will have its own outcome reports and counts of progress. 2016 just ended so these are not available yet. When they come out, they will describe the counts of how many people attended workshops and registered Wikipedia articles to add citations to academic journals. With Wikipedia being digitally native, so many metrics are available. That part of the impact can be measured quantitatively. Beyond that I am confident that the social outreach changes the cultural posturing in science to Wikipedia, which I think is overdue to change. Right now, Wikipedia is riding a 10+-year wave of being the world’s most consulted source of science information. Assuming that Wikipedia survives into the future, I think people might look back and wonder when Wikipedia’s influence as a popular publication was recognized, and this Year of Science campaign might be cited as one of the first examples of professional Wikipedia outreach into a population of people who still had serious reservations about acknowledging Wikipedia at all. It was a risk to do Year of Science in 2016; 2014 or before would have been premature considering Wikipedia’s reputation then. Although things are better now, things are changing quickly and every year outreach like this is becoming easier to conduct and more likely to have a high impact with less effort.

I am pleased with the campaign outcomes. From a Wikimedia community perspective of wanting to keep what worked and spend less time repeating the parts which were less effective, the campaign could be criticized, but I do not think the criticism should detract from celebrating everything that everyone accomplished. Most parts of the program were successful, and I expect that other stakeholders will publish to describe those parts. For the sake of anyone who might want to do similar projects, I will review the challenges.

Metrics are incomplete
The Wikipedia community values transparency. However, many people in the Wikipedia community stay in digital spaces and underestimate the difficulty of doing outreach away from the keyboard. The Year of Science tracked as much of Wikipedia engagement as is routine to track in outreach programs, but from anecdotes, I know that much and perhaps most data was not captured. There are various reasons for this. One reason is that Wikipedia’s software is nonprofit and rooted in the late 1990s, whereas commercial websites have all the advantages of being state of the art and intuitive to use. Wikipedia’s clunky interface and infrastructure is a barrier to getting users to agree to the lamest parts of Wikipedia, like volunteering for metrics tracking. In platforms like Facebook, every aspect of people’s lives is tracked routinely with single clicks, but in Wikipedia, there are social options to preserve privacy and then technical limitations even for people who are sharing what they do. The idea for metrics tracking in a program like this is that if someone volunteers to report to a campaign organizer which Wikipedia article they edited, then we ought to be able to track that. For a campaign like this, we actually need to be able to track hundreds or thousands of participants. In practice, this tracking connection is difficult to make in Wikipedia for reasons which are not present in other organizing platforms. This is simultaneously a problem, an intentional choice with its own rights-preserving benefits, and a social situation on which to reflect. Something that came out of this is development of the Programs & Events dashboard. I think that the P&E dashboard could prove to be one of the most significant innovations to Wikipedia in its entire history, because the dashboard is the first effort to provide a system for collecting media metrics reporting the impact of Wikipedia. 
When stories about Wikipedia communication metrics are told, then I think the Year of Science should be remembered as one of the second-wave driving forces in the development of the concept.

Some experiments failed to develop
In typical wiki-fashion, the beginning of the campaign was treated as a call for all sorts of sub-projects. Should the campaign include a contest, a newsletter, collaborations with 10+ ongoing Wikipedia initiatives, and formal partnerships with respected science organizations? As it happens, Wikipedia is an improvised project which changes quickly depending on participant interest. When a few people want something, they start to create it, and some kinds of communication which worked well for offline activism – like newsletters – can seem slow in the age of Internet. Wikipedia does have some newsletters, but just in the same way that The New York Times publishes online first and only puts yesterday’s news in the latest edition of their paper publications, things like newsletters for digital communities can have low relevance for people who are living the experience. The Year of Science campaign ambitiously listed a range of projects, but many never materialized, and things that did not seem important in the inception of the idea became important months later. Insiders of a campaign often hesitate to definitively strike an idea which is not progressing, but for this campaign, I think some of the ideas which were raised in the beginning looked quite dead to both Wikipedians and science professionals who might have checked the campaign page. Wikipedia has trouble managing timed campaigns, because it is difficult to crowdsource the management of projects which must happen on a schedule. Wiki-style editors will not be bold enough to go into a campaign space started by another and tell them that they need to abandon certain halted projects, and the leadership of a campaign might not be able to recognize when enough time has passed to declare an initiative dead. By the end of the year, the campaign page accumulated some distracting cruft. 
Anyone replicating the campaign should plan in advance how to introduce new ideas to stay current and how to kill off paused concepts to prevent being overburdened. I would recommend making modest promises in the beginning, introducing supplementary projects without prior announcement as a bonus rather than in fulfillment of a commitment, and not advertising any non-essential feature or service as ongoing and dependable until and unless that feature has already been provided in several iterations over a period of time.

No centralized forum
The idea of a centralized outreach campaign in Wikipedia is a little crazy. Wikipedia was imagined from its founding as a crowdsourced project in which any individual can contribute information, and other people can spontaneously organize to review and manage it according to rules which are developed by consensus. At no point in Wikipedia’s history has there been much concept of centralized leadership or even support. With Year of Science, there were outreach events in every way possible targeting individuals who would do anything, including editing articles, providing review and suggestions, developing the Wikidata database, or joining conversations. Beyond individuals all sorts of organizations external to Wikipedia participated, including conference teams, universities, social groups, and professional societies.

Although there was a campaign landing page to orient anyone to the Year of Science concept, the Wikipedia community is not accustomed to anticipating the existence of this kind of central campaign or using such forums provided by a campaign as a way to connect to sponsored support services. In some ways, Wiki Ed as an organization provided staff support for the outreach by setting up some basic infrastructure to make the campaign possible. Things that any traditional off-wiki outreach campaign would imagine to be essential – like logos, basic text instructions, sign-up sheets, reporting queues, designated talk pages, and some points of contact – are not aspects of Wikipedia community culture which the wiki community expects to exist in the wiki campaigns which have been successful to date. There is a cultural mismatch in what a science professional would expect to exist in a social campaign and what the wiki community imagines should exist. The organizers of the Year of Science campaign imagined the campaign landing page to be a bridge for this, and it was, but the concept of a traditional community entry point has not developed in the wiki community to a point which permits two-way communication between the Wikipedia community and people communicating in other ways. This is not a problem unique to Wikipedia, as people not familiar with communication in YouTube, Facebook, Twitter, or any other digital community platform have trouble moving messaging into and getting comments out of those platforms as well. With Wikipedia, the paths to communication are less developed, and the Year of Science pushed to test what was possible.

For future campaigns, as outreach becomes broader, there could be more notice of what central services are and are not available. The Wikipedia community will tend to anticipate that there is no central service; off-wiki communities will tend to expect that there will be. Both communities will have challenges grasping the reality which is in the middle of these expectations. The centralized support which is available should be ready to promote services to those not expecting them, and preemptively match the support requested by off-wiki communities to what is available.

Takeaways
Let’s do it again! The very precedent of the Year of Science is good for me in my medical outreach, because the credibility it generated gives me more of a foundation to go further. This kind of campaign could be repeated globally in all languages for a year, or anyone could modify the concept to be local in one language and for a shorter time if that suited them. I would like to see more science-themed campaigns. I can imagine other people exploring campaigns with themes in the humanities, for trades and labor, by geographical interest, for content types like datasets, or for engagement types like translation. I expect that the next campaign organizers will be better informed going into the project now that the risk has been taken.

This entire experience also marks one of the first times that content sponsorship has been provided, albeit in the wiki way. It is not at all orthodox right now for anyone to fund wiki development, but not only did Simons Foundation do this, but they even let it happen in the wiki way: with invitations for any person or organization to contribute and to share the information which was important to them, as a volunteer, and without any promotional agenda.

by bluerasberry at January 07, 2017 07:58 PM

Gerard Meijssen

#Maps - Where did they live?

This map is in many ways perfect. It tells us a story. It helps visualise what happened in the past. The map is simple: it shows the contours of present-day Europe, more or less, and in it you see roughly what happened where.

Obviously the map could be improved but typically it makes little difference for understanding what it is that is shown when it is seen in isolation.

When this map is part of a continuum of maps, it will show the movements over time. It will show where they are at a given time. They will show where the Vandals settled down and show where they fought their battles. Better understanding will emerge but it may get complicated. The Vandals were not the only ones around. It was a time of turmoil and only when the shape of former countries and battles are shown a better understanding emerges.

For many "former countries" maps are not available and when they are they are of a similar quality as the map of the Vandals. What I would love is maps as an overlay and just add maps and facts as they are available. Many maps will only over time get some credibility but it is an improvement over nothing to see.
Thanks,
       GerardM

by Gerard Meijssen (noreply@blogger.com) at January 07, 2017 07:44 AM

January 06, 2017

Wikimedia Foundation

The end of ownership? Rethinking digital property to favor consumers at a Yale ISP talk

Photo by Nick Allen, CC BY-SA 3.0.

Suppose a consumer named Alice buys a record of a David Bowie music album. Although Alice is not an expert in property law, she probably knows what privileges she enjoys by buying the LP record. For example, Alice can freely lend or rent it to her friend. Alice also possesses similar rights if she were to buy a book. But what happens when Alice buys an e-book or a song on iTunes? Can she enjoy the same rights with the e-book as she could with the book? Can she lend it or rent it to whoever she wants without any restrictions? Probably not. In the online world, users’ rights on digital copies and content are subject to licensing and technological restrictions imposed by copyright holders.

How we approach content licensing is critical for the Wikimedia projects and our mission of spreading free knowledge around the world. To help us better understand current research on this issue, I recently attended a talk on this subject hosted by our friends and collaborators at the Yale Information Society Project. The talk, entitled The End of Ownership: Personal Property in the Digital Economy, was given on November 3, 2016 by Professor Aaron Perzanowski of Case Western Reserve University.

Intellectual property, including copyright, is governed by the principle of exhaustion, also called the first-sale doctrine. This principle, established in Bobbs-Merrill Co. v. Straus in 1908, holds that copyright holders lose their ability to control further sales over their copyrighted works once they transfer the works to new owners. Bobbs-Merrill, the plaintiff and a publisher, drafted a notice in its books forbidding sales under one dollar, warning that violations of this condition would be considered copyright infringement. The defendants resold the books for less than a dollar each. In the end, the United States Supreme Court agreed with the defendants’ position and sent a clear message that copyright holders are not able to control prices or resales after the first sale of the copyrighted work. Even today, the first-sale doctrine is an important defense for consumers. In Alice’s case, copyright holders can control any use of the physical copy of the Bowie LP record until the first sale to Alice. Once Alice owns the record, she can re-sell it, donate it, etc.

But according to Perzanowski, the notion of property has changed: in the past, copies used to be valuable because they were scarce and difficult to produce. In the internet era, the paradigm has shifted. Because everything disseminates quickly and at almost no cost, copies have lost value. This is why buying in the digital world is a different experience. If Alice wants to buy an e-book on Amazon or an album on iTunes, she is not actually buying a “copy” of the e-book or album, but is instead licensing it. According to Perzanowski, these licensing schemes are undermining consumers’ rights that once were protected by ownership. Generally, the license terms of digital products will include a prohibition against transfer and sublicensing, among other restrictions. Thus, the notion of a copy of a work is disappearing, because in these licensing schemes, rights that are obvious in a physical object, like resale, rental, or donation rights, are neglected. Furthermore, these restrictions are authorized by law, specifically the Digital Millennium Copyright Act (DMCA), which includes provisions on Digital Rights Management (DRM) technologies that limit the consumer’s ability to use the product. If Alice buys an e-book, DRM technologies and legal provisions may limit her ability to print and copy-paste, and may impose time-limited access to her book.

Unfortunately, consumers do not seem to understand these limitations of the digital market. Perzanowski explained that 48% of online consumers think once they purchased an e-book by clicking on the “buy now” button, they are able to lend it to someone else and 86% think they actually own the book and can use it in the device of their choice. Perzanowski believes that consumers should be better informed so that they can have a better sense of autonomy on how to use the digital products they buy. For this reason, he advocates for better education on digital licensing so that consumers can recalibrate their expectations.

The use of Creative Commons (“CC”) licensing for content on the Wikimedia projects helps address Perzanowski’s concerns regarding limited consumer rights with respect to digital works and consumer education on digital licensing. First, CC licensing allows the Wikimedia communities to enjoy broader rights for digital works, such as the ability to share content with a friend or to produce derivatives, that are not available under more typical digital licensing schemes. Second, Creative Commons’ and the Foundation’s approaches to licensing provide certainty to consumers and promote transparency in how they can license content. For example, Creative Commons provides summaries of CC licenses in plain English. Similarly, the Wikimedia Foundation clearly explains to its users and contributors their rights and responsibilities in the use of CC licensed Wikimedia content. The Foundation also publicly consults with the Wikimedia communities on these licensing terms; it recently closed a consultation with the communities on a proposed change from CC BY-SA 3.0 to CC BY-SA 4.0.

In today’s world, physical copies and analog services are more the exception than the rule. This, however, shouldn’t mean that digital copies and online services have to be offered with fewer rights for users compared to rights available under traditional licensing schemes. We support licensing schemes that allow users to retain the same rights that they would otherwise have in the offline world so that users have the power to edit, share, and remix content: the more we empower, the more knowledge expands and creativity grows.

We believe strongly in a world where knowledge can be freely shared. Our visits to Yale ISP allow us to remain engaged in discussions about internet-related laws and affirm the importance of licenses like Creative Commons for the future of digital rights.

Ana Maria Acosta, Legal Fellow
Wikimedia Foundation

by Ana Maria Acosta at January 06, 2017 08:06 PM

#100womenwiki: A global Wikipedia editathon

Photo by BBC/Henry Iddon, CC BY-SA 3.0.

On 8 December 2016, Wikimedia communities around the world held a multi-lingual, multi-location editathon in partnership with the BBC to raise awareness of the gender gap on Wikipedia, improve coverage of women, and encourage women to edit.

In the UK, events took place at BBC sites in Cardiff, Glasgow, and Reading, in addition to the flagship event at Broadcasting House in London, while events took place around the world in cities like Cairo, Islamabad, Jerusalem, Kathmandu, Miami, Rio de Janeiro, Rome, Sao Paulo and Washington DC. Virtual editathons were organised by Wikimedia Bangladesh, and by Wikimujeres, Wikimedia Argentina and Wikimedia México for the Spanish-language Wikipedia. Women in Red were a strategic partner for the whole project, facilitating international partnerships between the BBC and local Wikimedia communities, helping to identify content gaps and sources, and working incredibly hard behind the scenes to improve new articles that were created as part of the project.

The global editathon was the finale of the BBC’s 100 Women series in 2016 and attracted substantial radio, television, online, and print media coverage worldwide.

The events were attended by hundreds of participants, many of them women and first-time editors, with nearly a thousand articles about women created or improved during the day itself. Impressively, Women in Red volunteers contributed over 500 new biographies to Wikipedia, with nearly 3000 articles improved as part of the campaign. Participants edited in languages including Arabic, Dari, English, Hausa, Hindi, Pashto, Persian, Russian, Spanish, Thai, Turkish, Urdu and Vietnamese, and were encouraged to live tweet the event using the shared hashtag #100womenwiki.

The online impact of #100womenwiki was significant; of equal importance, however, was the media coverage generated by the partnership. The BBC has a global reach of more than 350 million people a week, so this was a unique opportunity to highlight the gender gap, to raise the profile of the global Wikimedia community, and to reach potential new editors and supporters. In the UK, I was interviewed by Radio 5Live and Radio 4’s prestigious Today programme, while my colleague Stuart Prior and I appeared on the BBC World Service’s Science in Action programme. Dr Alice White, Wikimedian-in-Residence at the Wellcome Library, was also interviewed by 5Live, and Jimmy Wales came to Broadcasting House to be interviewed by BBC World News, BBC Outside Source and Facebook Live. The story was featured heavily in the BBC’s online news coverage on 8 December, including an article by Rosie Stephenson-Goodknight, and the project was covered by the Guardian, the Independent, and Metro in the UK, and by other print and online media across the world.

Jimmy Wales, founder of Wikipedia, being interviewed at BBC 100 Women. Photo by the BBC/Henry Iddon via Wikimedia UK, CC BY-SA 3.0.

The partnership with the BBC would not have been possible without the vision and energy of Fiona Crack, Editor and Founder of BBC 100 Women. After the events I spoke to her about what had been achieved and she reflected on how the combined reach and audience of the BBC and Wikimedia inspired and engaged people interested in women’s representation online. She commented: “It was a buzzing event here in London, but the satellite events from Kathmandu to Nairobi, Istanbul to Jakarta were the magic that made 100 Women and Wikimedia’s partnership so special.”

Clearly a project like #100womenwiki, focused on a single day of events, could never be a panacea for the gender gap on Wikimedia. After all, this is a complex issue reflecting systemic bias and gender inequality both online and in the wider world. With more lead-in time and resources, the partnership could have been even more successful, involving more Wikimedians and engaging and supporting more new editors.

However, events and partnerships like these demonstrate that the gender gap is not an entirely intractable issue. Within the global Wikimedia community, there are a significant number of people who are motivated to create change and willing to give up their free time contributing to Wikipedia and the sister projects, organising events, training editors and activating other volunteers and contributors in order to achieve it.

As the Chief Executive of Wikimedia UK—committed to building an inclusive online community and ensuring that Wikipedia reflects our diverse society and is free from bias—this is inspiring, encouraging, and humbling.

Lucy Crompton-Reid, Chief Executive
Wikimedia UK

This post was originally published on Wikimedia UK’s blog; it was adapted and lightly edited for publication in the Wikimedia blog.

by Lucy Crompton-Reid at January 06, 2017 06:08 PM

Wikimedia UK

So You’ve Decided to Become a Wikipedia Editor…

 

Learning to edit Wikipedia - Image by Jwslubbock CC BY-SA 4.0

The learning curve when you start editing Wikipedia and its sister projects can be steep, so to help you get started, we decided to compile some advice that will help you navigate the complexity of the Wikimedia projects.

Check out the Getting Started page for general advice and information about how Wikipedia works before you start editing. There are a lot of written and visual tutorials as well as links to policies and guidelines used on the site. A quick look at the main editorial policies of Wikipedia, known as the Five Pillars, is also worthwhile.

1) Identify a subject area you know about.

Usually people have a particular area that they know about or are interested in. Wikipedia has project pages where people with similar interests go to discuss writing. They’re a great place to see what subjects you can contribute to – they often have advice on what work needs to be done in their area: Directory of Wikiprojects.

For example, if you’re interested in increasing the number of articles about women on Wikipedia, look at the Women in Red project page.

2) Fight the desire to create a new article straight away.

There are lots of ways to contribute to Wikipedia, and creating a new article is a big step when you’re starting out. Instead, you could try:

  1. making copyedits (correcting mistakes);
  2. improving stubs (enlarging small articles), for which there is a Twitter bot that lists stubs for you;
  3. contributing to red link lists (a red link is a page that does not exist on Wikipedia yet).

3) Start with a reference.

Wikipedia is the best available version of the evidence about any subject, so if you have factual books at home, find a good fact and insert a reference on a page about that topic. Be careful, however; some subjects have higher referencing criteria, especially the medical pages, so if you’re not a specialist in a complex area like medicine, start with a simpler subject area.

Finding reliable sources can be difficult, so here is a page with tips on how to identify these. You can also check out the Wikipedia list of Open Access journals and the Directory of Open Access Journals for reliable research that you can reference.

4) Upload some photos to Commons.

As well as Wikipedia, one of the most important Wikimedia projects is Wikimedia Commons. If you’re more of a visual content creator than a writer, your photos might be useful to illustrate articles on Wikipedia.

Uploading to Commons means you agree that others can use your content for free without asking you as long as they give you credit as the author of the work. Such agreements are called Open Licenses or Creative Commons licenses, which go by odd names like CC BY-SA 4.0.

There are monthly photo competitions: current challenges are on drone photography, rail transport and home appliances.

You can also use the WikiShootMe tool to see what Wikipedia articles and Wikidata items are geolocated near your present location. Why not take images of some of the places listed and add the photos to their pages and data items?

5) Try to identify content gaps.

The English Wikipedia now has around 5.3 million articles, but the type of content skews towards the interests of the groups of people who are more likely to edit it. There are lots of articles on Pokemon and WWE wrestling, but fewer about ethnic minorities, important women, non-European history and culture, and many other topics.

There is a tool that you can use to search for content gaps by comparing one Wikipedia to another to see which articles exist in, for example, Spanish, but not in English. You can try it out here.
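The comparison described above amounts to a set difference between two language editions. A minimal sketch of the idea follows; it is not the tool's actual implementation, and the function name `content_gap` and the sample titles are hypothetical stand-ins for real interlanguage link data.

```python
def content_gap(source_titles, target_links):
    """Return source-wiki titles that have no counterpart in the target wiki.

    source_titles: iterable of article titles in the source language.
    target_links:  dict mapping a source title to its target-language title
                   (a hypothetical stand-in for interlanguage link data).
    """
    return sorted(title for title in source_titles if title not in target_links)


# Hypothetical sample data, not real Wikipedia contents.
spanish_titles = ["Artículo A", "Artículo B", "Artículo C"]
links_to_english = {"Artículo A": "Article A"}

# Titles that exist in Spanish but have no English counterpart.
missing_in_english = content_gap(spanish_titles, links_to_english)
print(missing_in_english)
```

Each title in the result is a candidate for translation or for a new article in the target language.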

6) Talk to other people in the community for advice.

Wikipedia has a help section with advice on how to get started, including a messageboard for asking questions and a help chatroom. There are also Facebook groups and IRC channels if you’re that cool.

If you’re one of those kinds of people who enjoys interacting with actual human beings in real life as well as online, there are social meetups for the Wikimedia community every month in London, Oxford and Cambridge, and periodically in Manchester and Edinburgh. There are also lots of events you can come to about specific subjects, many of which are hosted by our Wikimedians-in-Residence.

_ _ _ _ _ _ _ _ _ _

A lot of people use Wikipedia but never edit it, and consequently never think about how much effort goes into creating it. Participating in the creation of knowledge yourself is a really instructive way to discover how knowledge is created and structured, and the issues we face in producing accurate and impartial knowledge.

If you speak another language, you can practice by translating articles from English into a target language, and at the same time help people to educate themselves for free in another part of the world.

The world can feel disempowering sometimes, but if you help to create a good article or upload a good photograph, it could be seen by hundreds of thousands of people, and you could make a difference to someone’s education, or government policy, or the visibility of minority cultures.

So if you’ve decided to become more involved in Wikipedia or its sister projects this year, thank you! Wikimedia UK is here to support you, so don’t hesitate to get in touch and ask for advice. Wikipedia has always been, and will continue to be, a work in progress, and we think that provides exciting opportunities to help create a world in which every single human being can freely share in the sum of all knowledge.

by John Lubbock at January 06, 2017 04:39 PM

Erik Zachte

Wiki Loves Monuments 2016

In 2016, Wiki Loves Monuments (WLM) was a top-ranking community initiative in terms of attention raised.

Here are further stats on that contest. The charts follow the layout used in this blog in earlier years, but the data have now been collected from another WLM stats tool, wlm-stats. For added depth see also this Wikimedia blog post.

Participating countries in 2016 WLM contest

Map of countries participating in Wiki Loves Monuments 2016


Some charts are about image uploads.
One is about image uploaders, also known as contributors.

Countries

With 44 participating countries, 9 more than in 2015, the 2016 contest ranks second after 2013, when 53 countries participated. (See first table). 8 countries participated for the first time: Bangladesh, Georgia, Greece, Malta, Morocco, Nigeria, Peru and South Korea.

In the 7 years since WLM started, 7 countries have participated 6 times: Belgium, France, Germany, Norway, Russia, Spain, Sweden.

The contest ran in different countries during different periods (mostly because different calendars are in use, and the aim is to run the contest for a full calendar month).

List of countries that participated, per year

Participants per year in the Wiki Loves Monuments contest


Uploads

 

In 2016, a total of 277,406 images were uploaded, which is 20% more than in 2015.

[Chart: WLM uploads per year]


In 2016 Germany contributed the most images: 38,809.

[Chart: WLM uploads by country, 2016]

[Chart: cumulative WLM uploads by country, 2010-2016]


Contributors

In 2016 India and United States excelled in number of uploaders: 1784 vs 1783. As the measured numbers fluctuate a bit over time (there is always ongoing vetting), I suggest we call this an ex aequo first place.

[Chart: WLM contributors by country, 2016]
[Chart: WLM uploaders by country, year by year, 2010-2016, top 10]


Edit activity on Commons

Two Wikistats diagrams: every year the Wiki Loves Monuments contest brings peak activity on Commons. The second peak earlier in the year, mostly since 2014, is the result of the Wiki Loves Earth contest.

Charts also available on Wikimedia Commons.

[Chart: editors on Commons]


In 2016 the September peak (WLM) in uploads is again much more visible than the June peak (WLE). See also: this Wiki-loves yearly results page.

[Chart: uploads on Commons]

by Erik at January 06, 2017 01:56 PM

January 05, 2017

Content Translation Update

January 6 CX Update: Fixes for page loading and template editor

Hello, and welcome to the first CX Update post of 2017!

We just deployed two significant fixes:

  • Many users complained recently that they could not load a translation in progress that was auto-saved. This was happening when translating from languages that aren’t written in the Latin alphabet, such as Russian or Chinese. The data was not lost; it was correctly saved internally, but a software error prevented its proper loading. This is now supposed to be fixed, although some more work may be needed to make it more stable. If you experienced this issue, please try loading your article now. If it still doesn’t work, please report it. We apologize for this inconvenience. (bug report)
  • The template editor was remaining open when moving to the next section. This was confusing because some people didn’t realize that it’s supposed to be closed to actually save the data entered in the fields. It will now automatically close when moving to edit the next section. (bug report)

by aharoni at January 05, 2017 10:21 PM

Wikimedia Foundation

Coming soon: A global community survey to learn how to best support Wikimedians

Photo by Christopher Allison/Department of Defense, public domain/CC0.

How is the Wikimedia community’s health? Where do volunteer developers prefer to collaborate? What do Wikimedians think about partnerships? What is the most important workflow for editors?

These are only a few of the questions that the Wikimedia Foundation is asking in a new global community survey for Wikimedians. The opinions gathered from this survey will directly affect how the Foundation supports Wikimedia communities. The new survey, called Community Engagement Insights, is part of the Foundation’s annual plan. It was developed with input from 13 different teams at the Wikimedia Foundation and tested with Wikimedia volunteers.

Take the survey to influence decisions at the Foundation!

Surveys are important because they help us to design solutions, keeping community members at the heart of every project. In 2014, for example, we had many questions about harassment: How do our users experience harassment? What kinds of harassment are most common on which projects? How well were Foundation staff able to address harassment issues? The harassment survey was not a solution; it was the beginning of a conversation that has led to many actions.

Since the survey was completed, the movement has continued to take direct actions towards addressing the issue, from working on better detection (a project called Detox), to facilitating workshops and creating training modules. With the information from the survey, we were able to better understand the issues our communities face and to begin taking action. The harassment survey will continue to inform future approaches to the problem.

Other surveys we have done recently include Reimagining Grants in mid 2015, which we used to help improve the support the Foundation offers through grants; an executive director survey in early 2016, which the Foundation used to help inform our search for the Foundation’s new executive director; and the Tool Labs survey, which we used to identify needs in the labs space and which will help in planning for the 2017-2018 year. These are only some examples of how asking the right thing at the right time can have an impact in our movement.

Wikimedia is a global movement of people that goes beyond the online projects. Communities of volunteers are formed by citizens of the world who help create free access to knowledge. As a movement, we discuss and work on advocacy, mentorship, leadership, technology, collaboration and internationalization, among many other themes. The Foundation needs to make decisions and fund programs that will support Wikimedia communities in those specific areas. The Foundation also needs to learn and improve these programs. Are they being effective? Are they having the intended impact? Are these the right programs right now? The best way to answer these questions is to use tools that help us listen to communities. These include not only wiki pages, mailing lists, and conferences, but also consultations, interviews and surveys.

Community Engagement Insights is a new tool to help the Foundation listen to Wikimedia communities. This project seeks to improve surveys so the Foundation can try to hear from many voices equally and to make sure we are supporting and growing with the communities we directly assist. In other words, everyone who takes this survey can have a voice in deciding how the Foundation will improve their support. The data we collect will inform and guide the work of several teams at the Foundation.

Community Engagement Insights is not just a new survey; it is a new process that will support systematic surveys for the Wikimedia Foundation. The process is highly collaborative and includes designing questions and capturing data from various audiences that can help support decision-making for the Foundation. For this year, the survey will reach out to very active editors, active editors, affiliates, program leaders, and volunteer developers. In future years, the audiences might expand. We hope the data will also be useful for the broader movement. We are working to make the datasets accessible to anyone while protecting the privacy of users.

The collaborative aspect of this survey was specifically designed to reduce the number of times the Foundation polls communities on any given topic.

Photo by Sebastiaan ter Burg, CC BY 2.0.

Working towards actionable data

When we asked teams to come up with questions, we made sure that we understood their goals for each question and we asked how they intend to use the information they get for each prompt. If this information was unclear, we worked with the team to make sure there is a clear line of sight between the question and further actions.

Thirteen teams at the Foundation want to hear from Wikimedia communities. Here are some examples of the data they need: The Community Engagement Department wants to know about community health, because we need to learn more about how well we are collaborating, how well we support each other, and how engaged our volunteer community feels. The Technical Collaboration team wants to know about volunteer developer preferences in programming languages and collaboration spaces because we have a community of developers for whom we need to improve our support. The Global Reach team is working on expanding and changing their scope and they need to know what concerns communities have about different types of partnerships, including Wikipedia Zero, before making decisions about the types of partnerships they will pursue. The Editing Department needs to know more about which aspects of editing are more important for volunteers, so that the Foundation’s software developers can better prioritize their work improving the editing experience.

We are trying to get better at listening to communities, and surveys are one of the best ways we can do it for such a huge and diverse community. We also hope that having this process can help to reduce survey fatigue in the movement. As the project evolves, we hope it will begin to incorporate other surveys towards this end, and also to improve the quality of the data collected.

When you see the survey on your talk page, in your email, on a mailing list, or on social media, remember that your opinions will directly affect the Foundation’s work.

Find out more information about Community Engagement Insights in our Frequently Asked Questions. You can learn how we are doing the sampling process, what kinds of questions we are asking, and which teams are participating.

Edward Galvez, Survey specialist, Learning and Evaluation team
María Cruz, Communications and outreach coordinator, Learning and Evaluation team

 

by Edward Galvez and María Cruz at January 05, 2017 08:06 PM

Trump, Prince, and Queen Elizabeth: 2016’s most-read Wikipedia articles

Photo by Michael Candelori, CC BY 2.0.

In 2016, people around the world turned to Wikipedia for facts about all kinds of things, but especially celebrities who died, television shows, and Donald Trump, who commanded unprecedented attention both in the media and on the English-language Wikipedia.

In total, there were over 76 million views on the English-language article about the US president-elect, making it 2016’s most-viewed Wikipedia article. That total is nearly three times that of 2015’s first-place article, which had just under 28 million.

A number of these views were concentrated around major events in the US presidential campaign: in the days after Trump’s multiple victories on “Super Tuesday,” 11 million pageviews were recorded on the article, and another 11 million came in the four days after the election.

Wikipedia editors were certainly diligently keeping up with the fast-changing news about him—the article about Trump was the second-most revised article of the year on the English Wikipedia.

Moreover, the Trump phenomenon extended far beyond the English language. More than five million pageviews were recorded on his article from readers in the Spanish, German, and Russian languages; four million in French; and three million in Italian and Japanese.

A large percentage of these views came on November 9 and 10, the two days after Trump’s victory in the election. On the Russian Wikipedia, for instance, 36.7% of the entire year’s views came in those days. The number of people coming to read the article contributed to noticeable pageview jumps on the entire Spanish and Russian Wikipedias for November 9.

Back on the English Wikipedia, US politics occupied two other spots in the top ten and a third just outside it. The main article on last year’s US election was fourth, while Trump’s wife Melania (eighth) and Trump’s opponent Hillary Clinton (eleventh) garnered almost 19 and 18 million views, respectively.

Those numbers make them the highest-ranked women to ever appear in a Wikipedia most-viewed list, surpassing mixed martial artist Ronda Rousey’s 12 million in 2015.[1]

Photo by penner, CC BY-SA 3.0.

Politics, however, was far from the only story of 2016. The main article about deaths in 2016 was the second most-viewed article with almost 36 million views, a total that would have put it in first place in any of the last four years.[2]

Prince and David Bowie, both musicians of rare talent and influence who died last year, were the third- and seventh-most viewed articles of the year. Nearly 13 of Bowie’s 19 million came shortly after his death in January. Similarly, 17 of the 22 million views on Prince’s article came in the days after his unexpected death in April 2016. At one point at least, there were so many viewers and editors on Prince’s article (at its peak, an average of 810 views per second) that some were redirected to an error page.

Boxer Muhammad Ali and actor Carrie Fisher also appear in the top 25. The large majority of views to Fisher’s article came in the last week of the year, during which she was hospitalized and passed away.

Cinema and film, the theme of 2015, still makes up the bulk of the list, but last year had a notable bias towards superheroes. The headliner was Suicide Squad, whose 19.4 million pageviews were good enough for fifth. Trailers for the superhero ensemble blockbuster drummed up much interest for the film, leading to several view spikes during the year, but it faced “overwhelmingly negative” reviews upon its release. Franchise installments Captain America: Civil War, Batman v Superman: Dawn of Justice, and Deadpool all also appear in the top 12. And away from superheroes, the yearly list of Bollywood films, a perennial favorite, was viewed 19.2 million times.

Perhaps surprisingly, two 2015 films appear in 2016’s top 25. The Revenant, at #19, was released in the United States only days before the end of 2015. It got many views in 2016 after winning several awards, including star Leonardo DiCaprio’s first Oscar. Star Wars: The Force Awakens was powered to #21 by the tail end of its popularity. It was #3 on 2015’s list, and the article was viewed over 14 million times both in 2016 and in December 2015 alone.

Photo by NASA/Bill Ingalls, public domain/CC0.

People also jumped on Wikipedia to learn more about the television shows they were watching. Game of Thrones, for example, appears twice on the list (#18 and #22). But real-life individuals portrayed on television shows had an even more serious impact on the most-viewed list.

Streaming media company Netflix, for instance, is a strong suspect for two entries. The Crown, their biopic television series about the coronation and early reign of Queen Elizabeth II, likely helped boost the monarch to #13 on the list. Coming up shortly behind her was Narcos’ main character Pablo Escobar (#16), the Colombian drug lord who was once the wealthiest criminal in the world.

FX, the American television channel, might have joined them with The People v. O. J. Simpson: American Crime Story, as our list concludes at #25 with former American football player O. J. Simpson. The show, which aired in February and March 2016, is based on Simpson’s murder trial and eventual acquittal on all charges.

All three of these articles had some of the highest percentages of mobile views in the top 25, topping out with Simpson at 75.36%. This suggests a sort of second screen effect, where readers grabbed their mobile phones, searched Wikipedia, and educated themselves about the real-life equivalents to characters in front of them.

“There is certainly a correlation in article popularity for a new popular television show or movie and its primary actors,” says Wikipedia editor Milowent, one of two writers who examine the weekly popularity of Wikipedia articles in the Top 25 Report. “Articles about key figures in television shows, like Elizabeth II or Pablo Escobar, have been consistently popular. The correlation only grows for movies: Rogue One is currently near the top of the weekly charts, for example, helping boost lead actress Felicity Jones into the Top 25 in previous weeks.”

The top 25 articles follow below. You can see the top 5000 over on Wikipedia;[3] our grateful thanks go to researcher Andrew West for collating the data.

  1. Donald Trump (75,965,727)
  2. Deaths in 2016 (35,911,398)[2]
  3. Prince (musician) (22,793,889)
  4. United States presidential election, 2016 (22,063,171)
  5. Suicide Squad (film) (19,435,260)
  6. List of Bollywood films of 2016 (19,285,100)
  7. David Bowie (19,039,110)
  8. Melania Trump (18,946,792)
  9. Captain America: Civil War (18,693,046)
  10. Batman v Superman: Dawn of Justice (18,548,575)
  11. Hillary Clinton (17,801,991)
  12. Deadpool (film) (16,917,412)
  13. Elizabeth II (16,815,631)
  14. United States (16,502,083)
  15. Muhammad Ali (16,303,934)
  16. Pablo Escobar (16,210,514)
  17. Barack Obama (15,994,091)
  18. Game of Thrones (15,726,657)
  19. The Revenant (2015 film) (15,077,213)
  20. UEFA Euro 2016 (14,488,759)
  21. Star Wars: The Force Awakens (14,168,904)
  22. Game of Thrones (season 6) (14,111,811)
  23. 2016 Summer Olympics (14,026,668)
  24. Carrie Fisher (13,923,993)
  25. O. J. Simpson (13,795,907)

Ed Erhart, Editorial Associate
Wikimedia Foundation

You can see a list of 2016’s most-edited English Wikipedia articles and previous most-viewed lists from 2015 (1, 2), 2014, and 2013. Most-viewed English Wikipedia articles of each week are available through Wikipedia’s Top 25 Report.

Footnotes

  1. Pageview counts from before 2015 are not directly comparable, as mobile readers were not counted until October 2014.
  2. Wikipedians chronicle the deaths by month, so the page now redirects to a “list of lists of deaths.” This year’s list has already been started at deaths in 2017.
  3. The top 5000 include the percentage of mobile views for screening purposes: those with less than 5% or more than 95% can be safely discounted, as a significant share of the pageviews will have stemmed from spam, botnets, or other errors. We have also removed Earth, which would have come in at #20 but with only 9% mobile views, on the recommendation of the top 25 team.
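
The screening rule in footnote 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the row layout are hypothetical, the two "hypothetical" rows are invented for the example, and only the 5% and 95% thresholds and the Simpson and Earth percentages come from the text above.

```python
# Thresholds taken from footnote 3: a mobile share below 5% or above 95%
# suggests the views came from spam, botnets, or other errors.
LOW_PCT = 5.0
HIGH_PCT = 95.0


def is_plausible(mobile_pct: float) -> bool:
    """Return True when the mobile share lies within the 5%..95% band."""
    return LOW_PCT <= mobile_pct <= HIGH_PCT


def screen(rows):
    """Keep only rows whose mobile share passes the threshold check."""
    return [row for row in rows if is_plausible(row["mobile_pct"])]


rows = [
    {"title": "O. J. Simpson", "mobile_pct": 75.36},               # figure from the article
    {"title": "Earth", "mobile_pct": 9.0},                         # figure from footnote 3
    {"title": "Hypothetical spam target", "mobile_pct": 1.2},      # invented for illustration
    {"title": "Hypothetical botnet target", "mobile_pct": 99.4},   # invented for illustration
]

kept = [row["title"] for row in screen(rows)]
print(kept)
```

Note that Earth passes the automatic threshold at 9% mobile views; as the footnote says, it was removed by hand on the recommendation of the top 25 team, a judgment call that a simple cutoff like this does not capture.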

by Ed Erhart at January 05, 2017 03:54 PM

Shyamal

Tracing some ornithological roots

The years 1883-1885 were tumultuous in the history of zoology in India. A group called the Simla Naturalists' Society was formed in the summer of 1885. The founding President of the Simla group was, oddly enough, Courtenay Ilbert, whom some might remember for the Ilbert Bill, which allowed Indian magistrates to make judgements on British subjects. Another member of this Simla group was Henry Collett, who wrote a flora of the Simla region (Flora Simlensis). This Society vanished without much of a trace. A slightly more stable organization, the Bombay Natural History Society (BNHS), was begun in 1883. The creation of these organizations was precipitated by the emergence of a gaping hole: a vacuum created by the end of an India-wide correspondence network of naturalists that had been fostered by a one-man force, A. O. Hume. The ornithological chapter of Hume's life begins and ends in Shimla. Hume's serious ornithology began around 1870 and he gave it all up in 1883, after the loss of years of carefully prepared manuscripts for a magnum opus on Indian ornithology, damage to his specimen collections and a sudden immersion into Theosophy, which also led him to abjure the killing of animals, take to vegetarianism and subsequently take up the cause of Indian nationalism. The founders of the BNHS included Eha (E. H. Aitken, who was also a Hume/Stray Feathers correspondent), J. C. Anderson (who was a Simla naturalist) and Phipson (who was from a wine merchant family with a strong presence in Simla). One of the two Indian founding members, Dr Atmaram Pandurang, was the father-in-law of Hume's correspondent Harold Littledale, a college principal at Baroda.

Shimla then was where Hume rose in his career (as Secretary of State, before falling), allowing him to work on his hobby project of Indian ornithology by bringing together a large specimen collection and conducting the publication of Stray Feathers. Through readings, I had constructed a fairytale picture of the surroundings that he lived in. Richard Bowdler Sharpe, a curator at the British Museum who came to Shimla in 1885, wrote (his description is well worth reading in full):
... Mr. Hume who lives in a most picturesque situation high up on Jakko, the house being about 7800 feet above the level of the sea. From my bedroom window I had a fine view of the snowy range. ... at last I stood in the celebrated museum and gazed at the dozens upon dozens of tin cases which filled the room ... quite three times as large as our meeting-room at the Zoological Society, and, of course, much more lofty. Throughout this large room went three rows of table-cases with glass tops, in which were arranged a series of the birds of India sufficient for the identification of each species, while underneath these table-cases were enormous cabinets made of tin, with trays inside, containing series of the birds represented in the table-cases above. All the specimens were carefully done up in brown-paper cases, each labelled outside with full particulars of the specimen within. Fancy the labour this represents with 60,000 specimens! The tin cabinets were all of materials of the best quality, specially ordered from England, and put together by the best Calcutta workmen. At each end of the room were racks reaching up to the ceiling, and containing immense tin cases full of birds. As one of these racks had to be taken down during the repairs of the north end of the museum, the entire space between the table-cases was taken up by the tin cases formerly housed in it, so that there was literally no space to walk between the rows. On the western side of the museum was the library, reached by a descent of three steps - a cheerful room, furnished with large tables, and containing, besides the egg-cabinets, a well-chosen set of working volumes. ... In a few minutes an immense series of specimens could be spread out on the tables, while all the books were at hand for immediate reference. ...
we went below into the basement, which consisted of eight great rooms, six of them full, from floor to ceilings of cases of birds, while at the back of the house two large verandahs were piled high with cases full of large birds, such as Pelicans, Cranes, Vultures, &c.
I was certainly not hoping to find Hume's home as described but the situation turned out to be a lot worse. The first thing I did was to contact Professor Sriram Mehrotra, a senior historian who has published on the origins of the Indian National Congress. Prof. Mehrotra explained that Rothney Castle had long been altered with only the front facade retained along with the wood-framed conservatories. He said I could go and ask the caretaker for permission to see the grounds. He was sorry that he could not accompany me as it was physically demanding and he said that "the place moved him to tears." Professor Mehrotra also told me about how he had decided to live in Shimla simply because of his interest in Hume! I left him and walked to Christ Church and took the left branch going up to Jakhoo with some hopes. I met the caretaker of Rothney Castle in the garden where she was walking her dogs on a flat lawn, probably the same garden at the end of which there once had been a star-shaped flower bed, scene of the infamous brooch incident with Madame Blavatsky (see the theosophy section in Hume's biography on Wikipedia). It was a bit of a disappointment, however, as the caretaker informed me that I could not see the grounds unless the owner, who lived in Delhi, permitted it. Rothney Castle has changed hands so many times that it probably has nothing to match with what Bowdler Sharpe saw and the grounds may very soon be entirely unrecognizable but for the name plaque at the entrance. Another patch of land in front of Rothney Castle was being prepared for what might become a multi-storeyed building. A botanist friend had shown me a 19th century painting of Shimla made by Constance Frederica Gordon-Cumming. In her painting, the only building visible on Jakko Hill behind Christ Church is Rothney Castle. The vegetation in Shimla has definitely become denser with trees blocking the views.
 
So there ended my hopes of adding good views (free-licensed images are still misunderstood in India) of Rothney Castle to the Wikipedia article on Hume. I did however get a couple of photographs from the roadside. In 2014, I managed to visit the South London Botanical Institute which was the last of Hume's enterprises. This visit enabled the addition of a few pictures of his herbarium collections as well as an illustration of his bookplate which carries his personal motto.

Clearly Shimla empowered Hume and provided a stimulating environment, which included several local collaborators. Who were his local collaborators in Shimla? I have only recently discovered (and notes with references are now added to the Wikipedia entry for R. C. Tytler) that Robert (of Tytler's warbler fame, although it was named by W. E. Brooks) and Harriet Tytler (of Mt. Harriet fame) had established a kind of natural history museum at Bonnie Moon in Shimla with Lord Mayo's support. The museum closed down after Robert's death in 1872, and it is said that Harriet offered the bird specimens to the government. It would appear that at least some part of this collection went to Hume. It is said that the collection was packed away in boxes around 1873. The collection later came into the possession of Mr B. Bevan-Petman, who apparently passed it on to the Lahore Central Museum in 1917.

Hume's idea of mapping rainfall to examine patterns of avian distribution
It was under Lord Mayo that Hume rose in the government hierarchy. Hume was not averse to utilizing his power as Secretary of State to further his interests in birds. He organized the Lakshadweep survey with the assistance of the navy ostensibly to examine sites for a lighthouse. He made use of government machinery in the fisheries department (Francis Day) to help his Sind survey. He used the newly formed meteorological division of his own agricultural department to generate rainfall maps for use in Stray Feathers. He was probably the first to note the connection between rainfall and bird distributions, something that only Sharpe saw any special merit in. Perhaps placing specimens on those large tables described by Sharpe allowed Hume to see geographic trends.

Hume was also able to appreciate geology (in his youth he had studied with Mantell), earth history and avian evolution. Hume had several geologists contributing to ornithology including Stoliczka and Ball. One wonders if he took an interest in paleontology given his proximity to the Shiwalik ranges. Hume invited Richard Lydekker to publish a major note on avian osteology for the benefit of amateur ornithologists. Hume also had enough time to speculate on matters of avian biology. A couple of years ago I came across this bit that Hume wrote in the first of his Nests and Eggs volumes (published post-ornith-humously in 1889):

Nests and Eggs of Indian birds. Vol 1. p. 199
I wrote immediately to Tim Birkhead, the expert on evolutionary aspects of bird reproduction and someone with an excellent view of ornithological history (his Ten Thousand Birds is a must read for anyone interested in the subject) and he agreed that Hume had been an early and insightful observer to have suggested female sperm storage.

Shimla life was clearly a lot of hob-nobbing and people like Lord Mayo were spending huge amounts of time and money just hosting parties. Turns out that Lord Mayo even went to Paris to recruit a chef and brought in an Italian, Federico Peliti. (His great-grandson has a nice website!) Unlike Hume, Peliti rose in fame after Lord Mayo's death by setting up a cafe which became the heart of Shimla's social life and gossip. Lady Lytton (Lord Lytton was the one who demoted Hume!) recorded that Simla folk "...foregathered four days a week for prayer meetings, and the rest of the time was spent in writing poisonous official notes about each other." Another observer recorded that "in Simla you could not hear your own voice for the grinding of axes. But in 1884 the grinders were few. In the course of my service I saw much of Simla society, and I think it would compare most favourably with any other town of English-speaking people of the same size. It was bright and gay. We all lived, so to speak, in glass houses. The little bungalows perched on the mountainside wherever there was a ledge, with their winding paths under the pine trees, leading to our only road, the Mall." (Lawrence, Sir Walter Roper (1928) The India We Served.)

A view from Peliti's (1922).
Peliti's other contribution was in photography and it seems like he worked with Felice Beato who also influenced Harriet Tytler and her photography. I asked a couple of Shimla folks about the historic location of Peliti's cafe and they said it had become the Grand Hotel (now a government guest house). I subsequently found that Peliti did indeed start Peliti's Grand Hotel, which was destroyed in a fire in 1922, but the centre of Shimla's social life, his cafe, was actually next to the Combermere Bridge (it ran over a water storage tank and is today the location of the lift that runs between the Mall and the Cart Road). A photograph taken from "Peliti's" clearly lends support to this location, as do descriptions in Thacker's New Guide to Simla (1925). A poem celebrating Peliti's was published in Punch magazine in 1919. Rudyard Kipling was a fan of Peliti's but Hume was no fan of Kipling (Kipling seems to have held a spiteful view of liberals - "Pagett MP" has been identified by some as being based on W. S. Caine, a friend of Hume; Hume for his part had a lifelong disdain for journalists). Kipling's boss, E. K. Robinson, started the British Naturalists' Association, while E.K.R.'s brother Philip probably influenced Eha.

While Hume most likely stayed well away from Peliti's, we see that a kind of naturalists' social network existed within the government. About Lord Mayo we read:
Lord Mayo and the Natural History of India - His Excellency Lord Mayo, the Viceroy of India, has been making a very valuable collection of natural historical objects, illustrative of the fauna, ornithology, &c., of the Indian Empire. Some portion of these valuable acquisitions, principally birds and some insects, have been brought to England, and are now at 49 Wigmore Street, London, whence they will shortly be removed. - Perthshire Advertiser, 29 December 1870.
Another news report states:
The Earl of Mayo's collection of Indian birds, &c.

Amidst the cares of empire, the Earl of Mayo, the present ruler of India, has found time to form a valuable collection of objects illustrative of the natural history of the East, and especially of India. Some of these were brought over by the Countess when she visited England a short time since, and entrusted to the hands of Mr Edwin Ward, F.Z.S., for setting and arrangement, under the particular direction of the Countess herself. This portion, which consists chiefly of birds and insects, was to be seen yesterday at 49, Wigmore street, and, with the other objects accumulated in Mr Ward's establishment, presented a very striking picture. There are two library screens formed from the plumage of the grand argus pheasant - the head forward, the wing feathers extended in circular shape, those of the tail rising high above the rest. The peculiarities of the plumage have been extremely well preserved. These, though surrounded by other birds of more brilliant covering, preserved in screen pattern also, are most noticeable, and have been much admired. There are likewise two drawing-room screens of smaller Indian birds (thrush size) and insects. They are contained in glass cases, with frames of imitation bamboo, gilt. These birds are of varied and bright colours, and some of them are very rare. The Countess, who returned to India last month, will no doubt, add to the collection when she next comes back to England, as both the Earl and herself appear to take a great interest in illustrating the fauna and ornithology of India. The most noticeable object, however, in Mr. Ward's establishment is the representation of a fight between two tigers of great size. The gloss, grace, and spirit of the animals are very well preserved. The group is intended as a present to the Prince of Wales. It does not belong to the Mayo Collection. - The Northern Standard, January 7, 1871
And Hume's subsequent superior was Lord Northbrook about whom we read:
University and City Intelligence. - Lord Northbrook has presented to the University a valuable collection of skins of the game birds of India collected for him by Mr. A.O.Hume, C.B., a distinguished Indian ornithologist. Lord Northbrook, in a letter to Dr. Acland, assures him that the collection is very perfect, if not unique. A Decree was passed accepting the offer, and requesting the Vice-Chancellor to convey the thanks of the University to the donor. - Oxford Journal, 10 February 1877
Papilio mayo
Clearly Lord Mayo and his influence on naturalists in India is not sufficiently well understood. Perhaps that would explain the beautiful butterfly named after him shortly after his murder. It appears that Hume did not have this kind of hobby association with Lord Lytton, little wonder perhaps that he fared so badly!

Despite Hume's sharpness on many matters there were bits that come across as odd. In one article on the flight of birds he observes the soaring of crows and vultures behind his house as he sits in the morning looking towards Mahassu. He points out that these soaring birds would appear early on warm days and late on cold days but he misses the role of thermals and mixes physics with metaphysics, going for a kind of Grand Unification Theory:

He then claims that crows, like saints, sages and yogis, are capable of "aethrobacy".
This naturally became a target of ridicule. We have already seen the comments of E.H. Hankin on this. Hankin wrote that if levitation was achieved by "living an absolutely pure life and intense religious concentration" the hill crow must be indulging in "irreligious sentiments when trying to descend to earth without the help of gravity." Hankin, despite his studies, does not give enough credit to the forces of lift produced by thermals, and his own observations were critiqued by Gilbert Walker, the brilliant mathematician who applied his mind to large scale weather patterns apart from conducting some amazing research on the dynamics of boomerangs. His boomerang research had begun even in his undergraduate years and had earned him the nickname of Boomerang Walker. On my visit to Shimla, I went for a long walk down the quiet road winding through dense woodland and beside streams to Annandale, the only large flat ground in Shimla, where Sir Gilbert Walker conducted his weekend research on boomerangs. Walker's boomerang research mentions a collaboration with Oscar Eckenstein and there are some strange threads connecting Eckenstein, his collaborator Aleister Crowley and Hume's daughter Maria Jane Burnley, who would later join the Hermetic Order of the Golden Dawn. But that is just speculation!
1872 Map showing Rothney Castle

The steep road just below Rothney Castle

Excavation for new constructions just below and across the road from Rothney Castle

The embankment collapsing below the guard hut

The lower entrance, concrete constructions replace the old building

The guard hut and home are probably the only heritage structures left


I got back from Annandale and then walked down to Phagli on the southern slope of Shimla to see the place where my paternal grandfather once lived. It is not a coincidence that Shimla and my name are derived from the local deity Shyamaladevi (a version of Kali).


The South London Botanical Institute

After returning to England, Hume took an interest in botany. He made herbarium collections and in 1910 he established the South London Botanical Institute and left money in his will for its upkeep. The SLBI is housed in a quiet residential area. Here are some pictures I took in 2014, most of which can be found on Wikipedia.


Dr Roy Vickery displaying some of Hume's herbarium specimens

Specially designed cases for storing the herbarium sheets.

The entrance to the South London Botanical Institute

A herbarium sheet from the Hume collection

 
Hume's bookplate with personal motto - Industria et Perseverantia

An ornate clock which apparently adorned Rothney Castle
A special cover released by Shimla postal circle in 2012

Postscript

 An antique book shop had a set of Hume's Nests and Eggs (Second edition) and it bore the signature of "R.W.D. Morgan" - it appears that there was a BNHS member of that name from Calcutta c. 1933. It is unclear if it is the same person as Rhodes Morgan, who was a Hume correspondent and forest officer in Wynaad/Malabar who helped William Ruxton Davison.
Update: Henry Noltie of RBGE pointed out to me privately that this cannot be the forester Rhodes Morgan, who died in 1919! - September, 2016.

    by Shyamal L. (noreply@blogger.com) at January 05, 2017 12:36 PM

    Weekly OSM

    weeklyOSM 337

    12/27/2016-01/02/2017

    A motorway island discovered with JOSM 1 | © Andrey Golovin

    About us

    • We wish all our readers a peaceful and happy new year 2017. The German team produced a review of the year’s events with the most important links in our category maps. Unfortunately, the team has had no time to translate it into other languages. But the translation links should be fine in most of the articles. 😉
    • Due to a shortage of manpower we have had to drop our Italian version of weeklyOSM, so we will publish in 8 languages in the future. A special thanks to the two remaining editors, sabas88 and sbiribizio, who tried together to keep the Italian edition alive. For an edition, even if it is just translations, weeklyOSM requires a minimum of three editors. We thank them for their willingness to continue as weeklyOSM’s reporters for the Italian community.

    Mapping

    • Brian Prangle has written a guide to help with the collection and mapping of fire hydrants in the UK. This mapping effort was prompted by the declaration of fire hydrant locations in the West Midlands of England as “secret” (despite each one being labelled with a highly visible, often bright yellow sign)!
    • Harald Hartmann is looking for the source of the tag maxspeed=DE:rural. (automatic translation)
    • TagaSanPedroAko writes in detail about his latest mapping activity, including power line mapping and adding points of interest, in Batangas City – a place he has been regularly mapping.
    • Zverik analyses the change count of POIs by editor type.
    • Toc-rox reports that tracks often do not have a tracktype and explains how this is handled in the “Freizeitkarte”. A discussion about good default values has started. (automatic translation)

    Community

    • Tom analyses the building coverage of OpenStreetMap in Austria. (automatic translation)
    • A Hacker News user recommends OSM in response to a discussion about Google Maps lite mode, particularly with respect to the up-to-dateness of roads and loading speed in comparison with Google Maps.
    • Stumbled upon OpenStreetMap while playing Pokemon Go? Here are some tips to get started with contributing to OSM.
    • On Talk-GB, Brian Prangle shares a list of tentative projects in consideration for this year’s first ‘Quarterly Project’.
    • Andygol shows how to test whether a connected network of streets stays connected at other zoom levels.
    • User escada interviews Philippe Verdy as the Mapper of the Month.

    Imports

    • User ryebread brings to the community’s attention his efforts to import data from the Southeast Michigan Council of Governments (SEMCOG) for Detroit.

    OpenStreetMap Foundation

    • OSMF provides a recap of 2016, highlighting some of the interesting work done during the past year.

    Events

    Humanitarian OSM

    • In Tunis, young people are working towards a better image for a poor district. The project uses OSM and is financially supported by Switzerland.
    • Tyler Radford announced the successful completion of the campaign to fund critical community mapping projects.

    Maps

    • Greg reported the results of the last quarterly project. The aim was to use UK Food Hygiene Rating System data from the UK Food Standards Agency to improve the density of POIs, addresses and postcodes in town centres. Statistics and tools can still be used.
    • Christian Quest announced the renewed French map style and its move to a new server. See the new map and what was changed.
    • Sven Geggus improved the “German-Map-Style”. Very nice feature: The map shows names in two languages. Sven is looking for help.

    switch2OSM

    • The TAHUNA app beta version is available (automatic translation) from Google Play Store, adding Teasi navigation tools to your mobile device.

    Software

    • With its new update, Maps.me has integrated an important feature: traffic information.
    • The Android app OSM for the dyslexic is an OSM-based world atlas for dyslexic users. The development is part of the MyGEOSS project.
    • WordPress asks via tweet for beta testers on their upcoming OSM plugin version 3.8.

    Programming

    • Mike Fricker, the technical director for Unreal Engine 4 at Epic Games, has released a plugin for their popular game engine which can import OpenStreetMap data into their game editor.

    Releases

    Software             Version  Release date  Comment
    Mapillary Android *  3.12     2016-12-20    App install fix, stopping background upload service when finished.
    Maps.me Android *    var      2016-12-26    Travel data in 36 countries.
    Komoot iOS *         8.5.1    2016-12-27    See what month a highlight is most visited.
    Maps.me iOS *        7.0.4    2016-12-27    Travel data in 36 countries.
    Grass Gis            7.2.0    2016-12-28    More than 1,950 stability fixes and manual improvements, 50 new addons.
    JOSM                 11425    2016-12-31    Many improvements, see release info.
    SQLite               3.16.0   2017-01-02    14 enhancements and two bugfixes.

    Provided by the OSM Software Watchlist.

    (*) unfree software. See: freesoftware.

    Did you know …

    • … the German alternative to Google Maps? Maps.metager.de (automatic translation), run by SUMA-EV and Leibniz Universität Hannover, is currently in beta and only handles searches within Germany, based on OSM data. However, it promises to be one of the safest search engines.

    Other “geo” things

    • An article on L’Obs, the online edition of the Nouvel Observateur, once again looks at how Google adapts national boundaries to match each state’s view.
    • The Verge reports on the launch of a toilet-locator service by Google and India’s Ministry of Urban Development.

    Upcoming Events

    Where What When Country
    Dortmund Stammtisch 01/08/2017 Germany
    Manila 【MapAm❤re】OSM Workshop Series 6/8, San Juan 01/09/2017 Philippines
    Rennes Réunion mensuelle 01/09/2017 France
    Passau Niederbayerntreffen 01/09/2017 Germany
    Lyon Rencontre mensuelle mappeurs 01/10/2017 France
    Nantes Rencontres mensuelles 01/10/2017 France
    Berlin 103. Berlin-Brandenburg Stammtisch 01/12/2017 Germany
    Ulloa 1er encuentro comunidad OSMCo 01/13/2017-01/15/2017 Colombia
    Kyoto 【西国街道シリーズ】長岡天満宮マッピングパーティ 01/14/2017 Japan
    Rennes Atelier de découverte 01/15/2017 France
    Lyon Mapathon Missing Maps Avancé pour Ouahigouya 01/16/2017 France
    Brussels Brussels Meetup 01/16/2017 Belgium
    Essen Stammtisch 01/16/2017 Germany
    Manila 【MapAm❤re】OSM Workshop Series 7/8, San Juan 01/16/2017 Philippines
    Cologne/Bonn Bonner Stammtisch 01/17/2017 Germany
    Scotland Edinburgh 01/17/2017 UK
    Osnabrück Stammtisch / OSM Treffen 01/18/2017 Germany
    Karlsruhe Stammtisch 01/18/2017 Germany
    Osaka もくもくマッピング! #02 01/18/2017 Japan
    Leoben Stammtisch Obersteiermark 01/19/2017 Austria
    Urspring Stammtisch Ulmer Alb 01/19/2017 Germany
    Tokyo 東京!街歩き!マッピングパーティ:第4回 根津神社 01/21/2017 Japan
    Manila 【MapAm❤re】OSM Workshop Series 8/8, San Juan 01/23/2017 Philippines
    Bremen Bremer Mappertreffen 01/23/2017 Germany
    Brussels FOSDEM 2017 02/04/2017-02/05/2017 Belgium

    Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

    This weeklyOSM was produced by Peda, Polyglot, Rogehm, SomeoneElse, SrrReal, TheFive, YoViajo, derFred, jinalfoflia, keithonearth, kreuzschnabel, seumas.

    by weeklyteam at January 05, 2017 11:48 AM

    Wiki Education Foundation

    Wiki Ed encourages geophysicists to teach with Wikipedia

    Last month, Outreach Manager Samantha Weald, Classroom Program Manager Helaine Blumenthal, Director of Programs LiAnna Davis, and I attended the American Geophysical Union’s annual meeting here in San Francisco. At the conference, we spoke to dozens of scientists who believe Wikipedia is a valuable website for them, their students, and the world. We’re excited to bring more geophysics, geology, and earth science students to Wikipedia in the coming years, helping us amplify the impact of this year’s Wikipedia Year of Science.

    In January 2016, we started the Year of Science as a year-long campaign to improve Wikipedia’s science coverage. After all, Wikipedia is the main source of scientific information for the general public. So if we’re interested in a scientifically literate populace, we need to make science accessible and available to those who don’t pursue it as a career.

    The idea was simple: students have all the tools they need to contribute to the public scholarship of science. They have access to rigorous research and scientific journals through the university library and they regularly meet with an expert in the field who explains important topics and concepts. Plus, students are still studying and learning about scientific topics as they develop their own expertise, so they’re less removed from communicating these ideas to non-experts than decades-long researchers. This key attribute makes students ideal science communicators, and we want to help students actively build those science communication skills in the classroom.

    In the Classroom Program, we provide the toolkit students need to become Wikipedians, or contributors to the encyclopedia. Over the course of the semester, students identify missing components in a Wikipedia article related to class, research the topic, and learn how to add well-sourced information to Wikipedia. We’ve worked with several earth science courses over the years, which is why we created a guide for students editing environmental science content.

    During the Year of Science, more than 6,000 science students have used our training materials to learn how Wikipedia works. Together, they added 4.93 million words about science to Wikipedia. At the AGU conference, we were proud to share these results with potential program participants, as they considered the value Wikipedia assignments can bring not only to their classroom, but also to the public. By inviting these instructors to join our program, we will build on the accomplishments our students made during 2016 to make science accessible to those outside of academia.

    If you teach in the earth sciences and are interested in learning more about increasing your students’ contributions to public scholarship about the earth and climate, email us at contact@wikiedu.org.

    by Jami Mathewson at January 05, 2017 12:02 AM