WordPress.com Social Reciprocity Visualization Challenge

Every day, millions of people connect with ideas, photos, and other content on WordPress.com. Here at Automattic, we take pride in enabling this interaction, and continually strive to make the WordPress.com platform better for users.

Our data science team examines these user interactions and aims to turn our insights into user-facing features and tools. With this challenge, we decided to open up some of our work and share with the community some of the questions we are excited about.

On our platform, and across the Web, the question of social reciprocity is one of the most interesting. How do platform design, user content, and social activity combine to affect user engagement?

We are running this visualization challenge on the Databits.io platform, where we’re inviting anyone who’s obsessed with data like we are to come up with some interesting visualizations of the following scenarios. We’re offering a $1,000 prize for the best one!

Two ideas we’d love to see explored are:

  • User-to-user social reciprocity. The provided data is sufficiently rich to explore the dynamics of user-to-user social interactions. Are there compelling stories we can tell about how individual users react, over time, to other users’ actions on the platform? How do blog posting and the kind of blogging content enter the picture?
  • User-to-community social reciprocity. There are actions that users send to the broader WordPress community and also records of the community generating social interactions on the users’ blogs. On the scale of user to community interaction, are there patterns that can help understand social reciprocity? Does the interaction depend on blog posting? What are the temporal dynamics?

Read more about the data, the challenge, and the prizes being offered for the best visualizations over at Databits.io.

If you love data like we do, consider joining our team! We’re currently hiring Data Wranglers.

Photon WebP Image Support

We’re happy to announce that the Photon image service now offers seamless support for the WebP image format. This new feature provides size reductions of up to 34% for served images compared to a JPEG of an equivalent visual quality level. One thing to keep in mind, though: WebP isn’t currently supported by all browsers (see the WebP FAQ for more details).

JPEG vs WebP

Visually identical images in JPEG and WebP format with their respective sizes.

The trick in serving a format that isn’t universally supported is to auto-detect which browsers do support it, a process that can become clunky and unwieldy. We found a simple solution, however, in the Accept header, which is sent along with every image request. In this header the browser specifies whether it supports the WebP image format. Consequently, when our regionally distributed Photon CDN servers analyze the headers of an incoming request, each server can detect the best image format for that browser and serve it from its local cache.
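The Accept-header check described above can be sketched in a few lines of PHP. This is only an illustration, not Photon's actual code: the function names accepts_webp() and choose_image_format() are made up for the example, and the sketch ignores edge cases such as a media range sent with q=0.

```php
<?php
/**
 * Return true when an Accept header value advertises WebP support.
 * Hypothetical helper for illustration; not Photon's real implementation.
 */
function accepts_webp( string $accept_header ): bool {
    // Split the header into its comma-separated media ranges.
    foreach ( explode( ',', $accept_header ) as $range ) {
        // Drop parameters such as ";q=0.8", then normalize case and whitespace.
        $media_type = strtolower( trim( explode( ';', $range )[0] ) );
        if ( 'image/webp' === $media_type ) {
            return true;
        }
    }
    return false;
}

/** Pick the format to serve for a request (illustrative only). */
function choose_image_format( string $accept_header ): string {
    return accepts_webp( $accept_header ) ? 'webp' : 'jpeg';
}

echo choose_image_format( 'image/webp,image/apng,image/*,*/*;q=0.8' ), "\n"; // webp
echo choose_image_format( 'image/png,image/*;q=0.8,*/*;q=0.5' ), "\n";      // jpeg
```

Because the decision depends on a request header, any cache sitting in front of such a check has to keep the WebP and JPEG variants apart, for example by varying on Accept.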

We should note that by default, Photon compresses WebP images at a high quality setting. To get the most out of this new feature, set the quality parameter explicitly and trade a little visual quality for smaller files.

WordPress Developers: Test your i18n (internationalization) knowledge!

Alex Kirk lives in Austria and is a developer on the i18n (internationalization) team at Automattic. We’re looking for talented people wherever they live. Why not join our team?

Whenever we write plugins or themes, there is one thing that needs a little extra attention and is quite frankly hard to get right: Translatable text.

Be it a button or some explanatory text, you will generally want to make that text translatable into other languages, so that even more people can use your piece of software. While there is a very extensive guide available in the WordPress Handbook, we have created a fun way to brush up on your knowledge of how to get things right: a quiz.

If you’re reading this post via a feed reader or an e-mail subscription, we encourage you to view the post on our developers blog to take the test (there are no winners or losers, this is meant to help you learn!), as it uses a little JavaScript to tell you whether an answer is right or wrong.

For each answer, we also provide an explanation, whether it’s right or wrong. So after clicking the answer that you think is right, make sure to click the other ones to explore what might be wrong about them.

So without further ado, take the quiz below!

You want to output the username in a sentence. Assume that the $username has been escaped using esc_html(). How do you do that?
<?php printf( __( 'Howdy, %s!' ), $username ); ?>
Good! Some languages may need to switch the location of the username to the front of this string. This code provides the needed flexibility by including both the placeholder and the punctuation mark. Check the other answers though, there is an even better answer.
<?php /* translators: %s is a username */ printf( __( 'Howdy, %s!' ), $username ); ?>
Awesome, the comment for translators is the cherry on the cake, as they cannot see variable names. Some languages may need to switch the location of the username to the front of this string. This code provides needed flexibility by including both the placeholder and the punctuation mark.
<?php printf( __( 'Howdy, %s' ), $username ); ?>!
This is almost correct. The punctuation mark should be included in the translatable string.
<?php echo __( 'Howdy' ) . ', ' . $username; ?>!
Translators may need to put the username first in other languages. That’s not possible with this code because it isn’t using a placeholder and a function that does substitution such as printf.
<?php _e( 'Howdy, %s!', $username ); ?>
The _e() function can only output text. It does not substitute variables.
<?php _e( "Howdy, $username!" ); ?>
Variables in a string are a no-no because the translated text is loaded by using the original English text which needs to be the same for all possible outputs.
You need to include a link in a sentence. How can you do that?
printf( __( 'Publish something using our <a href="%s">Post by Email</a> feature.'), 'http://support.wordpress.com/post-by-email/' );
Correct. Embed HTML in the string when it is necessary to keep the sentence structure intact for translators. Examples include link tags or bold/italic markup around a mid-sentence word.
_e( 'Publish something using our <a href="http://support.wordpress.com/post-by-email/">Post by Email</a> feature.' );
We don’t want to include URLs in the translation because we don’t want to expose them as translatable to translators. Also, if the URL is hardcoded within the string and then we ever change it, the entire string will become a new translation which will require re-translation.
printf( __( 'Publish something using our %s feature.' ), sprintf( '<a href="http://support.wordpress.com/post-by-email/">%s</a>', __( 'Post by Email' ) ) );
This code breaks the sentence up which causes a loss of full context during translation. We always try to keep full sentences/phrases together because having the whole string leads to much better translations.
Which of these is the correct way to use the single/plural _n() function?
printf( _n( '%d person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
Correct. Always use a placeholder in both singular and plural strings.
printf( _n( 'One person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
The hardcoded “One” in the singular string is problematic. We always want to use a placeholder in both singular and plural strings. Some languages (such as Russian) have multiple plurals which require the flexibility provided by using the placeholder in the singular string.
“So and so many people have seen this post” should be output like this:
printf( _n( '%d person has seen this post.', '%d people have seen this post.', $view_count ), $view_count );
Correct. We use the variable twice: 1) we need the number for the _n() function to determine the correct singular/plural text and 2) we need the number for the subsequent substitution in the printf. Also, it’s very important that the %d placeholder is used in the singular string (and not a hardcoded “1”) because some languages, such as Russian, have multiple plural forms. Those languages rely on that flexibility in the singular string.
printf( __( '%d people have seen this post.' ), $view_count );
For strings like this containing a numerical count, we want to use _n() instead because we always need to include the singular form of the string–even if the singular case should never happen. Why? Some languages, such as Russian, have multiple plural forms and they rely on flexibility provided by the singular string.
printf( _n( '%d person has seen this post.', '%d people have seen this post.' ), $view_count );
Almost. The _n() function also needs to know about the count value via its third parameter so it can determine the correct text.
printf( 1 == $view_count ? __( '%d person has seen this post.' ) : __( '%d people have seen this post.' ), $view_count );
Some languages have multiple plural forms–not just the typical singular/plural distinction–so this approach is problematic. We need to use _n() instead as it accounts for those multiple plural form complexities.
echo _n( 'One person has seen this post', "$view_count people have seen this post." );
Several things are amiss here. First, the hardcoded “One” needs to be a %d placeholder because some languages have multiple plural forms–not just the typical singular/plural distinction–and _n() with proper placeholdering handles that. The second issue is that $view_count needs to be a %d placeholder as well. Finally, all the above means that we need to switch the echo to a printf to use the placeholders and we’ll also want to add $view_count as a third argument to _n() as it expects a count value to determine which string to use.
How do you deal with outputting a variable in the context of a translation?
<h1><?php printf( __( 'Hello %s' ), esc_html( $world ) ); ?></h1>
Correct. Here PHP 1) swaps in the translated string which also contains the %s placeholder, 2) escapes the $world var safely, and then 3) substitutes the now escaped $world value into the placeholder spot. Exactly what we want.
One reminder, though: if you use this piece of code you need to be sure that you have verified your translations, so that your translation of Hello %s doesn’t include malicious code. If you don’t trust your translations, you should use an esc_html( sprintf() ) construction instead of the printf.
<h1><?php printf( esc_html__( 'Hello %s' ), $world ); ?></h1>
This code is unsafe because it isn’t escaping $world at all. PHP runs esc_html__ first which swaps in the translated string (eg, "Hola %s") and then escapes it. Unfortunately, after that, printf swaps the value of $world into the placeholder which is unescaped. Danger, Will Robinson, danger!
<h1><?php echo esc_html__( sprintf( 'Hello %s', $world ) ); ?></h1>
We never want a sprintf inside a translation function. Translation files are generated by a cron job that parses (not executes!) PHP files looking for the translation functions. The sprintf isn’t resolved when that parsing happens, which means this code will just produce garbage translation data.
<h1><?php esc_html_e( 'Hello %s', $world ); ?></h1>
The second parameter of esc_html_e() is the text domain, not a value to substitute. We need printf here to do the variable substitution.
What’s the best practice to include formatted numbers in strings?
printf( _n( 'Today you already got %s view.', 'Today you already got %s views.', $view_count ), number_format_i18n( $view_count ) );
Correct. Use _n() for the possibly singular/plural string and use number_format_i18n() to actually format the number to local rules (for example some locales have a different thousand separator). We do indeed use %s here for the number because number_format_i18n() returns a formatted string.
$views = number_format( $view_count );
printf( _n( 'Today you already got %d view.', 'Today you already got %d views.' ), $views );
There are a few problems here. We want to be using number_format_i18n(). Also, number_format_i18n() produces strings, not numbers, so we need to use %s. Finally, in addition to printf, we need to give the count number to the _n() function so it knows which string variant to use.
_en_fmt( 'Today you already got %d view.', 'Today you already got %d views.', $views );
Arrowed! There isn’t a _en_fmt() function.
How to deal with multiple variables in a translated string?
printf( __( 'Posted on %1$s by %2$s.' ), $date, $username );
Almost correct. The placeholders are numbered so their values can be re-arranged if need be in translations. The remaining problem, though: translators don’t see the variable names, so they can only guess that one variable is a date and the other one is a username.
/* translators: %1$s is a date, %2$s is a username */
printf( __( 'Posted on %1$s by %2$s.' ), $date, $username );
Perfect. We make sure to number our placeholders so their values can be re-arranged if need be in translations. Also we give additional info to translators so that they can know which variable means what.
printf( __( 'Posted on %(date)s by %(username)s.' ), $date, $username );
Good thinking, but this syntax unfortunately is not available in PHP.
printf( __( 'Posted on %s by %s.' ), $date, $username );
We want to make sure we use numbered placeholders (ie, %1$s, %2$s, etc) whenever there is more than one placeholder because translators may need to re-arrange their locations in their translations.
Which of these is correct?
switch ( $type ) {
    case 'date':
        printf( __( 'Sorted by date' ) );
        break;
    case 'comments':
        printf( __( 'Sorted by comments' ) );
        break;
}
Correct. We want to give translators full sentences/phrases.
switch ( $type ) {
    case 'date':
        printf( __( 'Sorted by %s.' ), __( 'date' ) );
        break;
    case 'comments':
        printf( __( 'Sorted by %s.' ), __( 'comments' ) );
        break;
}
Unnecessarily breaking up sentences/phrases is a problem for translators. “Date” by itself may be translated differently from when it is used in a sentence, so we want to keep complete sentences/phrases together whenever possible.
$pattern = __( 'Sorted by %s.' );
switch ( $type ) {
    case 'date':
        printf( $pattern, __( 'date' ) );
        break;
    case 'comments':
        printf( $pattern, __( 'comments' ) );
        break;
}
This looks so efficient but unfortunately it’s wrong: essentially this is a concatenation of strings, which can’t be done in translations, because a generic translation of “date” might be wrong in the context of sorting. Or it would need to be in another grammatical case. Or other reasons. Short: don’t do that.
printf( __( 'Sorted by %s.' ), __( $type ) );
The code here won’t work because translation functions cannot be fed PHP variables. Translation files are generated by a cron job that parses (not executes!) PHP files looking for the translation functions. It doesn’t execute any of the PHP, so the variable is unresolved, which leads to garbage translation data (actually, the parsing just rejects it).
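Several answers above mention that some languages, such as Russian, have multiple plural forms. The sketch below shows, in plain PHP with no WordPress functions, how a count maps to one of three forms under the plural rule commonly used for Russian in gettext catalogs. The function name russian_plural_index() is made up for this illustration; in real code _n() makes this choice for you based on the loaded translation.

```php
<?php
/**
 * Return the plural-form index (0, 1, or 2) for a count in Russian,
 * following the plural rule commonly used in gettext catalogs.
 * Hypothetical helper for illustration only.
 */
function russian_plural_index( int $n ): int {
    $mod10  = $n % 10;
    $mod100 = $n % 100;
    if ( 1 === $mod10 && 11 !== $mod100 ) {
        return 0; // 1, 21, 31, ...
    }
    if ( $mod10 >= 2 && $mod10 <= 4 && ( $mod100 < 10 || $mod100 >= 20 ) ) {
        return 1; // 2-4, 22-24, ...
    }
    return 2; // 0, 5-20, 25-30, ...
}

foreach ( array( 1, 2, 5, 11, 21, 22 ) as $count ) {
    printf( "%d => form %d\n", $count, russian_plural_index( $count ) );
}
```

Note that 21 lands in the same form as 1. That is why the "singular" string needs a %d placeholder rather than a hardcoded “One”: the same translated string is reused for 21, 31, and so on.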

Lossy Image Compression with Photon

If you were watching closely, you may have noticed that we recently introduced the option for lossy JPEG compression with Photon. The new parameters are quality and strip. Quality is pretty straightforward: the image quality out of 100. Strip refers to metadata that can be stripped from an image, namely EXIF and color data. It accepts exif, color, or all (which strips both).

For example: https://developer.files.wordpress.com/2015/02/dsc01921.jpg?w=780&quality=80&strip=all
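If you build such URLs by hand, a small helper keeps the query string well-formed. The following is a minimal sketch: photon_style_url() is a hypothetical name, and it only appends whatever arguments you pass (here the w, quality, and strip values from the example URL above) without validating them.

```php
<?php
/**
 * Append Photon-style query arguments to an image URL.
 * Hypothetical helper for illustration; it does not validate the arguments.
 */
function photon_style_url( string $image_url, array $args ): string {
    // Use "?" for the first argument and "&" if the URL already has a query string.
    $separator = ( false === strpos( $image_url, '?' ) ) ? '?' : '&';
    return $image_url . $separator . http_build_query( $args, '', '&' );
}

echo photon_style_url(
    'https://developer.files.wordpress.com/2015/02/dsc01921.jpg',
    array( 'w' => 780, 'quality' => 80, 'strip' => 'all' )
), "\n";
// https://developer.files.wordpress.com/2015/02/dsc01921.jpg?w=780&quality=80&strip=all
```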

You can drop a snippet like this in a plugin to set the quality and strip parameters for every image on the site.

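// Apply lossy compression settings to every image Photon serves for this site.
add_filter( 'jetpack_photon_pre_args', 'jetpackme_custom_photon_compression' );
function jetpackme_custom_photon_compression( $args ) {
    $args['quality'] = 80;    // Image quality out of 100.
    $args['strip']   = 'all'; // Strip both EXIF and color metadata.
    return $args;
}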

The results can be pretty dramatic. At full size, this image of downtown Madison goes from 16MB to 2.7MB by setting the quality to 80%. That’s a big deal on a mobile connection and it’s pretty hard to spot the difference on most images unless you’re looking at them side by side.

A more secure REST API

Because privacy and security are important to users across the internet, many services have begun to encrypt the connection between a user’s browser and their servers. The use of SSL (or TLS) largely eliminates the likelihood that a “man-in-the-middle” is able to monitor a user’s activities on the web. To this end, WordPress.com is joining the likes of Google and Facebook in encrypting all of the traffic sent across our network. We are currently in the process of forcing many of our services to be accessible through HTTPS exclusively.

Previously, the WordPress.com/Jetpack JSON API could be accessed over plain HTTP, but only for unauthenticated requests. As part of the SSL transition, all public-api.wordpress.com endpoints are now accessible via HTTPS only. Any request made to the HTTP version of a URL now receives a 301 redirect to the HTTPS version.

What does this mean for you?

For the majority of our API consumers, this won’t require any change as you are likely already using the HTTPS URLs with authenticated endpoints. If you are not, now is the time to update your API calls to the secure URLs.
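If your code still has plain-HTTP API URLs stored in configuration, one way to update them without relying on the redirect is a small rewrite step like the sketch below. The helper name force_https_api_url() and the example path are assumptions made for illustration; only the public-api.wordpress.com host comes from this post.

```php
<?php
/**
 * Rewrite a plain-HTTP public-api.wordpress.com URL to HTTPS.
 * Hypothetical helper; URLs for other hosts are returned untouched.
 */
function force_https_api_url( string $url ): string {
    $prefix = 'http://public-api.wordpress.com/';
    // Case-insensitive match on the scheme and host at the start of the URL.
    if ( 0 === stripos( $url, $prefix ) ) {
        return 'https://' . substr( $url, strlen( 'http://' ) );
    }
    return $url;
}

// The path below is illustrative only.
echo force_https_api_url( 'http://public-api.wordpress.com/rest/v1/sites/example.wordpress.com/posts/' ), "\n";
```

Calling the HTTPS URL directly also saves the extra round trip that the 301 redirect would otherwise cost on every request.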

By making this change, we’re helping make the web a more secure place for our users.

As always, if you have any questions about the API, don’t hesitate to comment below or reach out to us via our developer contact form.

Version 1.1 of the WordPress.com REST API

Today, we’ve launched version 1.1 of the WordPress.com REST API. In recent weeks, we’ve been hard at work launching new features on WordPress.com, and many of these changes are powered by our REST API. When we started working on the upgrades to stats and post management, we quickly realized that the existing endpoints didn’t have all the power we needed to provide the best experience. In order to add the functionality we needed to the API without breaking existing implementations, we decided to version our API.

What does this mean for you?

If you’re already implementing version 1 of the API, you’ll be able to continue using those endpoints without changing your code for the foreseeable future. Version 1 of the API is now deprecated, so any new development you do should be against 1.1. We currently have no plans to disable version 1 of the REST API; should we ever decide to do so, we’ll give you plenty of advance notice.
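One way to make a later move between versions painless is to keep the version in a single place in your client code. The sketch below assumes that versioned endpoints live under a base URL of the form https://public-api.wordpress.com/rest/v1.1/; that base, the API_BASE constant, and the api_url() helper are assumptions for illustration, so check the REST API documentation for the exact paths.

```php
<?php
// Assumption: versioned endpoints live under this base URL; confirm against the docs.
const API_BASE = 'https://public-api.wordpress.com/rest/';

/**
 * Build an endpoint URL for a given API version (hypothetical helper).
 * New code defaults to 1.1; pass '1' to keep calling the deprecated version.
 */
function api_url( string $path, string $version = '1.1' ): string {
    return API_BASE . 'v' . $version . '/' . ltrim( $path, '/' );
}

echo api_url( '/sites/example.wordpress.com/' ), "\n";
// https://public-api.wordpress.com/rest/v1.1/sites/example.wordpress.com/
echo api_url( '/sites/example.wordpress.com/', '1' ), "\n";
// https://public-api.wordpress.com/rest/v1/sites/example.wordpress.com/
```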

Media Endpoints

  • Upload support for all file types. If you can upload it through the media explorer, you can upload it with the API. PDFs, Docs, PowerPoint files, audio files, and videos (Jetpack & .com blogs with VideoPress) are all supported.
  • Better error handling. If you upload multiple files and some fail, it’s easier to pull those out and retry.
  • Improved consistency with other endpoints and cleaned up response parameters.
  • When uploading files, you can now pass attributes like name and description without needing to do a second call to the update endpoint.
  • Bonus: The /sites/$site/ endpoint now returns a list of allowed file types.

Stats Endpoints

  • Support for pulling back stats over multiple days without those stats being grouped into a single result.
  • New stats detailing the top comment authors on your site, as well as the posts that have received the most comments.
  • Chartable data about likes and comments, in addition to the existing chart data for views and visitors.
  • Keep on track with your posting goals. The new streak endpoint contains the data to help motivate you to post more often.

We’re looking forward to seeing what you build using version 1.1. Take a look at the REST API documentation to get started. If you have any questions about the API, don’t hesitate to comment below or reach out to us via our developer contact form.

On API Correctness

Developing APIs is hard.

You pour your blood, sweat, and tears into this interface that bares the soul of your company and of your product to the world. The machinery under the hood, though, is often a lot less polished than the fancy paint job would lead the rest of the world to believe. You have to be careful, then, not to inflict your own rough edges on the people you expect to be consuming your API because…

Using APIs is hard.

As an app developer you’re trying to take someone else’s product and somehow integrate it into whatever vision you have in your head. Whether it’s simply getting a list of things from another service (such as embedding a reading list) or wrapping your entire product around another product (using Amazon S3 as your primary binary storage mechanism, for example), you have a lot of things to reconcile.

You have your own programming language (or languages) that you’re using. There’s the use case you have in mind, and the ones the remote devs had in mind for the API. There’s the programming language they used to create the API (and that they used to test it). Finally, don’t forget the encoding or representation of the data — and its limitations. Reconciling all of the slight (or major) differences between these elements is a real challenge sometimes. Despite years of attempts at best practices and industry standards, things just don’t always fit together like we pretend that they will.

As a developer providing an API it’s important to remember three things. There are obviously many other things to consider, but these three things are more universal than most.

#1 You want people to use your API.

Unless you’re developing a completely internal API, you’re hoping that the world sees your API as something amazing, and that your functionality starts popping up in other magical places without any further effort on your part.

#2 You have no control over what tools others are using.

Are you using a language that has little or no variable type enforcement? Some people aren’t. Some of those people still want to use your product. Did you come up with your own way of doing things with custom code instead of using widely-adopted industry standards (which, being widely deployed, come with battle-tested libraries in many languages)? Did that cause you to release a client in your own language (how about Clojure, how about Erlang, how about C++, how about Perl, how about…)?

#3 Your API is a promise.

It’s easy to forget (especially for those of us who spend our time in a forgiving language such as PHP or Python) that the API we provide is a promise to the rest of the world. What it promises is this: “When you provide me with ${this} I will provide you with ${that}”.

The super-important (and insidiously non-obvious) thing about this is that if you do not provide a written promise (in the form of your API’s documentation), then the behavior of your API becomes the implicit promise.

The most important thing to note here is that when your documentation is wrong, the promise of your actual behavior wins every single time.

Keep your promises

When your promises don’t match your actual results things get hairy.

Let’s take a look at a completely hypothetical situation.

  1. You have an API that is documented to return a JSON object with a success member which should be a boolean value.
  2. You have a case (maybe all cases) where success is actually rendered as an integer (0 for false, 1 for true).
  3. John has an app written in a strongly-typed language that works around this by defining success as an integer type instead of a boolean type. Because John was busy, he never got around to letting you know. Or maybe John never knew because he simply inspected your API and worked backwards from the responses that you gave. Now John’s app has 100k users depending on this functionality.
  4. Mary is writing an app, and because Mary doesn’t like to play fast and loose (and she doesn’t want her app to break later on) she submits an issue pointing out that you are returning the wrong type.

At this point you are trapped. The existing user base (and by extension their user base) is committed to integers. And you only have four options.

  1. You can cripple an existing and deployed application enjoyed by 100k users.
  2. You can version your API — an entire new version to correct what should be a boolean value.
  3. You can work with John to roll out a new version of the app which can handle both (but maybe his app is in the iOS app store, and getting everyone to update is impossible, takes a long time, and/or would require a lengthy, and potentially costly, review process by yet another party).
  4. You make a really sad face and change your promise — to reflect that you are going to do what is actually the less correct thing, forever.

Because you wrote an API whose promise was wrong, or whose promise was missing, you have painted yourself into a very undesirable corner. You’re now in a place where doing the right thing for the right reasons is the wrong move.

So do yourself, and everyone else, one of two favors — depending on the position in which you find yourself.

If you’re producing an API, take extra care to make sure that your results match your documentation (and you need to have documentation).

If you’re consuming an API, don’t be like John. Don’t work backwards from the data — work forwards from the docs. And if the docs are wrong you should submit a ticket and wait for it to be fixed (or at the very, very least, make sure your workaround deals with both the documented expectation and the actual incorrect return value).
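As a sketch of the "deal with both" workaround, here is how a PHP client might read the success member from the hypothetical scenario above. It accepts the documented boolean and the observed 0/1 integers, and fails loudly on anything else. The function name read_success_flag() and the sample payloads are made up for illustration.

```php
<?php
/**
 * Read the "success" member of a decoded API response, accepting both the
 * documented boolean and the observed integer (0/1) form.
 * Anything else is treated as a broken promise and raises an exception.
 */
function read_success_flag( array $response ): bool {
    if ( ! array_key_exists( 'success', $response ) ) {
        throw new UnexpectedValueException( 'Response has no "success" member.' );
    }
    $value = $response['success'];
    if ( is_bool( $value ) ) {
        return $value; // Documented behavior.
    }
    if ( 0 === $value || 1 === $value ) {
        return 1 === $value; // Observed, undocumented behavior.
    }
    throw new UnexpectedValueException( 'Unexpected type for "success".' );
}

var_dump( read_success_flag( json_decode( '{"success": true}', true ) ) ); // bool(true)
var_dump( read_success_flag( json_decode( '{"success": 0}', true ) ) );    // bool(false)
```

Written this way, the client keeps working whether or not the API is later fixed to match its documentation, which leaves the API's maintainers free to do the right thing.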

In conclusion

Just as with a child, it takes a village to raise a good, decent, hard-working API.