‌’
Apostrophe
'
Typewriter Apostrophe

The apostrophe ( ’, although often rendered as ' ) is a punctuation mark, and sometimes a diacritic mark, in languages that use the Latin alphabet or certain other alphabets. In English, it serves three purposes:[1]

  • The marking of the omission of one or more letters (as in the contraction of do not to don't).
  • The marking of possessive case (as in the cat's whiskers).
  • The marking as plural of written items that are not words established in English orthography (as in P's and Q's, the late 1950's). (This is considered incorrect by some; see Use in forming certain plurals. The use of the apostrophe to form plurals of proper words, as in apple's, banana's, etc., is universally considered incorrect.)

According to the Oxford English Dictionary (OED), the word comes ultimately from Greek ἡ ἀπόστροφος [προσῳδία] (hē apóstrophos [prosōidía], "[the accent of] 'turning away', or elision"), through Latin and French.[2]

The apostrophe usually looks the same as a closing single quotation mark, although they have different meanings. The apostrophe also looks similar to the prime symbol ( ′ ), which is used to indicate measurement in feet or arcminutes, as well as for various mathematical purposes, and to the ʻokina ( ʻ ), which represents a glottal stop in Polynesian languages.
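Because these marks are so easily confused on the page, it can help to look at them as distinct characters. The following is a minimal sketch in Python that prints the Unicode code point and official character name of each look-alike mentioned above. The dictionary labels and the helper name `describe` are illustrative choices, not terms from any style guide.

```python
import unicodedata

# Look-alike characters mentioned in the paragraph above.
# The labels on the left are informal descriptions chosen for this sketch.
LOOKALIKES = {
    "typewriter apostrophe": "\u0027",
    "typographic apostrophe / closing single quotation mark": "\u2019",
    "prime": "\u2032",
    "okina": "\u02bb",
}

def describe(char: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"

for label, char in LOOKALIKES.items():
    print(f"{label}: {describe(char)}")
```

The typographic apostrophe and the closing single quotation mark share one code point, which is why software cannot tell them apart without context.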


English language usage

Historical development

The apostrophe was introduced into English in the 16th century in imitation of French practice.[3]

French practice

Introduced by Geoffroy Tory (1530), the apostrophe was used in place of a vowel letter to indicate elision (as in l'heure in place of la heure). It was frequently used in place of letter e when no actual vowel sound was elided (as in un' heure). Modern French orthography has restored the spelling une heure.[4]

Early English practice

From the 16th century, following French practice, the apostrophe was used when a vowel letter was omitted either because of incidental elision (I'm for I am) or because the letter no longer represented a sound (lov'd for loved). English spelling retained many inflections that were not pronounced as syllables, notably verb endings (-est, -eth, -es, -ed) and the noun ending -es, which marked either plurals or possessives (also known as genitives; see Possessive apostrophe, below). So apostrophe followed by s was often used to mark a plural, especially when the noun was a loan word (and especially a word ending in a, as in the two comma's).[3]

Standardisation

The use of elision has continued to the present day, but significant changes have been made to the possessive and plural uses. By the 18th century, apostrophe + s was regularly used for all possessive singular forms, even when the letter e was not omitted (as in the gate's height). This was regarded as representing the Old English genitive singular inflection -es. The plural use was greatly reduced, but a need was felt to mark possessive plural. The solution was to use an apostrophe after the plural s (as in girls' dresses). However, this was not universally accepted until the mid-19th century.[3]

Possessive apostrophe

The apostrophe is used to indicate possession. This convention distinguishes possessive singular forms (Bernadette's, flower's, glass's, one's) from simple plural forms (Bernadettes, flowers, glasses, ones), and both of those from possessive plural forms (Bernadettes', flowers', glasses', ones'). For singulars, the modern possessive or genitive inflection is a survival from certain genitive inflections in Old English, and the apostrophe originally marked the loss of the old e (for example, lambes became lamb's).

General principles for the possessive apostrophe

Summary of rules for most situations
  • Possessive personal pronouns, serving as either noun-equivalents or adjective-equivalents, do not use an apostrophe, even when they end in s. The complete list of those ending in the letter s or the corresponding sound /s/ or /z/ but not taking an apostrophe is ours, yours, his, hers, its, theirs, and whose.
  • Other pronouns, singular nouns not ending in s, and plural nouns not ending in s all take 's in the possessive: e.g., someone's, a cat's toys, women's.
  • Plural nouns already ending in s take only an apostrophe after the pre-existing s when the possessive is formed: e.g., three cats' toys.
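The three summary rules above can be sketched as a small Python function. This is an illustration under stated assumptions, not a complete grammar: the function name `possessive` and its `plural` flag are hypothetical, the caller must say whether the word is plural, and singular nouns ending in s are simply given 's even though practice varies for them.

```python
# Possessive pronouns that never take an apostrophe (list from the summary above).
NO_APOSTROPHE = {"ours", "yours", "his", "hers", "its", "theirs", "whose"}

def possessive(word: str, plural: bool = False) -> str:
    """Form a possessive following the summary rules.

    `plural` is supplied by the caller; the function does not try to
    detect grammatical number on its own.
    """
    if word.lower() in NO_APOSTROPHE:
        return word            # already possessive: ours, its, whose
    if plural and word.endswith("s"):
        return word + "'"      # cats -> cats'
    return word + "'s"         # cat -> cat's, women -> women's

print(possessive("cat"))                  # singular noun
print(possessive("cats", plural=True))    # plural ending in s
print(possessive("women", plural=True))   # plural not ending in s
```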
Basic rule (singular nouns)

For most singular nouns the ending 's is added; e.g., the cat's whiskers.

  • If a singular noun ends with an s-sound (spelled with -s, -se, for example), practice varies as to whether to add 's or the apostrophe alone. A widely accepted practice is to follow whichever spoken form is judged better: the boss's shoes, Mrs Jones' hat (or Mrs Jones's hat, if that spoken form is preferred). In many cases, both spoken and written forms differ between writers. (See details below.)

Basic rule (plural nouns)

When the noun is a normal plural, with an added s, no extra s is added in the possessive; so pens' caps (where there is more than one pen) is correct rather than pens's caps.

  • If the plural is not one that is formed by adding s, an s is added for the possessive, after the apostrophe: children's hats, women's hairdresser, some people's eyes (but compare some peoples' recent emergence into nationhood, where peoples is meant as the plural of the singular people). These principles are universally accepted.
  • A few English nouns have plurals that are not spelled with a final s but end in an /s/ or a /z/ sound: mice (plural of mouse, and for compounds like dormouse, titmouse), dice (when used as the plural of die), pence (a plural of penny, with compounds like sixpence that now tend to be taken as singulars). In the absence of specific exceptional treatment in style guides, the possessives of these plurals are formed by adding an apostrophe and an s in the standard way: seven titmice's tails were found, the dice's last fall was a seven, his few pence's value was not enough to buy bread. These would often be rephrased, where possible: the last fall of the dice was a seven.[5]
Basic rule (compound nouns)

Compound nouns have their singular possessives formed with an apostrophe and an added s, in accordance with the rules given above: the Attorney-General's husband; the Lord Warden of the Cinque Ports' prerogative; this Minister for Justice's intervention; her father-in-law's new wife.

  • In such examples, the plurals are formed with an s that does not occur at the end: e.g., attorneys-general. A problem therefore arises with the possessive plurals of these compounds. Sources that rule on the matter appear to favour the following forms, in which there is both an s added to form the plural, and a separate s added for the possessive: the attorneys-general's husbands; successive Ministers for Justice's interventions; their fathers-in-law's new wives.[6] Because these constructions stretch the resources of punctuation beyond comfort, in practice they are normally reworded: interventions by successive Ministers for Justice.[7][8]
Joint and separate possession

A distinction is made between joint possession (Jason and Sue's e-mails: the e-mails of both Jason and Sue), and separate possession (Jason's and Sue's e-mails: both the e-mails of Jason and the e-mails of Sue). Style guides differ only in how much detail they provide concerning these.[9] Their consensus is that if possession is joint, only the last possessor has possessive inflection; in separate possession all the possessors have possessive inflection. If, however, any of the possessors is indicated by a pronoun, then for both joint and separate possession all of the possessors have possessive inflection (his and her e-mails; his, her, and Anthea's e-mails; Jason's and her e-mails; His and Sue's e-mails; His and Sue's wedding; His and Sue's weddings).
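The consensus rule for noun possessors can be sketched in Python as follows. The function name `possessive_phrase` and its parameters are hypothetical. The sketch deliberately ignores pronoun possessors, which the paragraph above treats differently, and assumes every owner is a singular noun that takes 's.

```python
def possessive_phrase(owners: list[str], thing: str, joint: bool) -> str:
    """Build a possessive phrase for noun owners (pronouns are not handled).

    joint=True  -> only the last owner is marked (Jason and Sue's e-mails)
    joint=False -> every owner is marked (Jason's and Sue's e-mails)
    """
    if not owners:
        raise ValueError("at least one owner is required")
    if joint:
        marked = owners[:-1] + [owners[-1] + "'s"]
    else:
        marked = [owner + "'s" for owner in owners]
    if len(marked) == 1:
        joined = marked[0]
    elif len(marked) == 2:
        joined = " and ".join(marked)
    else:
        joined = ", ".join(marked[:-1]) + ", and " + marked[-1]
    return f"{joined} {thing}"

print(possessive_phrase(["Jason", "Sue"], "e-mails", joint=True))
print(possessive_phrase(["Jason", "Sue"], "e-mails", joint=False))
```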

Note that in cases of joint possession, the above rule does not distinguish between a situation in which only one or more jointly possessed items perform a grammatical role and a situation in which both one or more such items and a non-possessing entity independently perform that role. Although verb number suffices in some cases ("Jason and Sue's dog has porphyria") and context suffices in others ("Jason and Sue's e-mails rarely exceed 200 characters in length"), number and grammatical position often prevent a resolution of ambiguity:

  • Where multiple items are possessed and context is not dispositive, a rule forbidding distribution of the possessive merely shifts ambiguity: suppose that Jason and Sue had one or more children who died in a car crash and that none of Jason's children by anyone other than Sue were killed. Under a rule forbidding distribution of the joint possessive, writing "Jason and Sue's children [rather than "Jason's and Sue's children"] died in the crash" eliminates the implication that Jason lost children of whom Sue was not the mother, but it introduces ambiguity as to whether Jason himself was killed.
  • Moreover, if only one item is possessed, the rule against distribution of the joint possessive introduces ambiguity (unless the context happens to resolve it): read in light of a rule requiring distribution, the sentence "Jason and Sue's dog died after being hit by a bus" makes clear that the dog belonged to Sue alone and that Jason survived or was not involved, whereas a rule prohibiting distribution forces ambiguity as to both whether Jason (co-)owned the dog and whether he was killed.
With other punctuation; compounds with pronouns

If the word or compound includes, or even ends with, a punctuation mark, an apostrophe and an s are still added in the usual way: Westward Ho!'s railway station; Awaye!'s Paulette Whitten recorded Bob Wilson's story;[10] Washington, D.C.'s museums,[11] assuming that the prevailing style requires full stops in D.C.

  • If the word or compound already includes a possessive apostrophe, a double possessive results: Tom's sisters' careers; the head of marketing's husband's preference; the master of foxhounds' best dog's death. Some style guides, while allowing that these constructions are possible, advise rephrasing: the preference of the head of marketing's husband. If an original apostrophe, or apostrophe with s, occurs at the end, it is left by itself to do double duty: Our employees are better paid than McDonald's employees; Standard & Poor's indexes are widely used; the 5uu's first album (the fixed forms of McDonald's and Standard & Poor's already include possessive apostrophes; 5uu's already has a non-possessive apostrophe before its final s). For similar cases involving geographical names, see below.
  • By extended application of the principles stated above, the possessives of all phrases whose wording is fixed are formed in the same way:
For complications with foreign phrases and titles, see below.
Time, money, and similar

An apostrophe is used in time and money references, among others, in constructions such as one hour's respite, two weeks' holiday, a dollar's worth, five pounds' worth, one mile's drive from here. This is like an ordinary possessive use. For example, one hour's respite means a respite of one hour (exactly as the cat's whiskers means the whiskers of the cat). Exceptions are accounted for in the same way: three months pregnant (in modern usage, we do not say pregnant of three months, nor one month(')s pregnant).

Possessive pronouns and adjectives

No apostrophe is used in the following possessive pronouns and adjectives: yours, his, hers, ours, its, theirs, and whose.

The possessive of it was originally it's, and many people continue to write it this way, though the apostrophe was dropped in the early 1800s and authorities are now unanimous that it's can be only a contraction of it is or it has.[14][15] For example, US President Thomas Jefferson used it's as a possessive in his instructions dated 20 June 1803 to Lewis for his preparations for his great expedition.[16]

All other possessive pronouns ending in s do take an apostrophe: one's; everyone's; somebody's, nobody else's, etc. With plural forms, the apostrophe follows the s, as with nouns: the others' husbands (but compare They all looked at each other's husbands, in which both each and other are singular).

Importance for disambiguation

Each of these four phrases (listed in Steven Pinker's The Language Instinct) has a distinct meaning:

  • My sister's friend's investments (the investments belonging to a friend of my sister)
  • My sister's friends' investments (the investments belonging to several friends of my sister)
  • My sisters' friend's investments (the investments belonging to a friend of several of my sisters)
  • My sisters' friends' investments (the investments belonging to several friends of several of my sisters)

Kingsley Amis, on being challenged to produce a sentence whose meaning depended on a possessive apostrophe, came up with:

  • Those things over there are my husband's. (Those things over there belong to my husband.)
  • Those things over there are my husbands'. (Those things over there belong to several husbands of mine.)
  • Those things over there are my husbands. (I'm married to those men over there.)[17]

Singular nouns ending with an "s" or "z" sound

This subsection deals with singular nouns pronounced with a sibilant sound at the end: /s/ or /z/. The spelling of these ends with -s, -se, -z, -ze, -ce, -x, or -xe.

Many respected authorities recommend that practically all singular nouns, including those ending with a sibilant sound, have possessive forms with an extra s after the apostrophe so that the spelling reflects the underlying pronunciation. Examples include Oxford University Press, the Modern Language Association, the BBC and The Economist.[18] Such authorities demand possessive singulars like these: Senator Jones's umbrella; Tony Adams's friend. Rules that modify or extend the standard principle have included the following:

  • If the singular possessive is difficult or awkward to pronounce with an added sibilant, do not add an extra s; these exceptions are supported by The Guardian,[19] the Yahoo! Style Guide,[20] and The American Heritage Book of English Usage.[21] Such sources permit possessive singulars like these: Socrates' later suggestion; or Achilles' heel if that is how the pronunciation is intended.
  • Classical, biblical, and similar names ending in a sibilant, especially if they are polysyllabic, do not take an added s in the possessive; among sources giving exceptions of this kind are The Times[22] and The Elements of Style, which make general stipulations, and Vanderbilt University,[23] which mentions only Moses and Jesus. As a particular case, Jesus' is very commonly written instead of Jesus's, even by people who would otherwise add 's in, for example, James's or Chris's. Jesus' is referred to as "an accepted liturgical archaism" in Hart's Rules.

However, some contemporary writers still follow the older practice of omitting the extra s in all cases ending with a sibilant, but usually not when written -x or -xe.[24] Some contemporary authorities such as the Associated Press Stylebook[25] and The Chicago Manual of Style recommend or allow the practice of omitting the extra "s" in all words ending with an "s", but not in words ending with other sibilants ("z" and "x").[26] The 15th edition of The Chicago Manual of Style still recommended the traditional practice, which included providing for several exceptions to accommodate spoken usage such as the omission of the extra s after a polysyllabic word ending in a sibilant. The 16th edition of CMOS no longer recommends omitting the extra "s".[27]
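Two of the competing conventions described in this subsection can be contrasted in a short Python sketch. The policy names `always_s` and `bare_after_s` are labels invented here for illustration. They are simplified readings of the text above, not the wording of any style guide, and the pronunciation-based exceptions are not modelled.

```python
def singular_possessive(noun: str, policy: str = "always_s") -> str:
    """Form a singular possessive under one of two simplified policies.

    always_s     -> add 's to every singular noun (Jones's)
    bare_after_s -> add only an apostrophe after a final letter s (Jones'),
                    but still add 's after other sibilants such as x or z
    """
    if policy == "always_s":
        return noun + "'s"
    if policy == "bare_after_s":
        return noun + "'" if noun.endswith("s") else noun + "'s"
    raise ValueError(f"unknown policy: {policy}")

for name in ("Jones", "Knox", "Adams"):
    print(name, singular_possessive(name), singular_possessive(name, "bare_after_s"))
```

A writer following a house style would pick one policy and apply it consistently.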

Similar examples of notable names ending in an s that are often given a possessive apostrophe with no additional s include Dickens and Williams. There is often a policy of leaving off the additional s on any such name, but this can prove problematic when specific names are contradictory (for example, St James' Park in Newcastle [the football ground] and the area of St. James's Park in London). For more details on practice with geographic names, see the relevant section below.

Some writers like to reflect standard spoken practice in cases like these with sake: for convenience' sake, for goodness' sake, for appearance' sake, for compromise' sake, etc. This punctuation is preferred in major style guides. Others prefer to add 's: for convenience's sake.[28] Still others prefer to omit the apostrophe when there is an s sound before sake: for morality's sake, but for convenience sake.[29]

The Supreme Court of the United States is split on whether a possessive singular noun that ends with s should always have an additional s after the apostrophe, sometimes have an additional s after the apostrophe (for instance, based on whether the final sound of the original word is pronounced /s/ or /z/), or never have an additional s after the apostrophe. The informal majority view (5–4, based on past writings of the justices) has favoured the additional s, but a strong minority disagrees.[30]

Nouns ending with silent "s", "x" or "z"

The English possessive of French nouns ending in a silent s, x, or z is rendered differently by different authorities. Some people prefer Descartes' and Dumas', while others insist on Descartes's and Dumas's.[citation needed] Certainly a sibilant is pronounced in these cases; the theoretical question is whether the existing final letter is sounded or whether s needs to be added.[citation needed] Similar examples with x or z: Sauce Périgueux's main ingredient is truffle; His pince-nez's loss went unnoticed; "Verreaux('s) eagle, a large, predominantly black eagle, Aquila verreauxi,..." (OED, entry for "Verreaux", with silent x; see Verreaux's eagle); in each of these some writers might omit the added s. The same principles and residual uncertainties apply with "naturalised" English words, like Illinois and Arkansas.[31]

For possessive plurals of words ending in silent x, z or s, the few authorities that address the issue at all typically call for an added s and require that the apostrophe precede the s: The Loucheux's homeland is in the Yukon; Compare the two Dumas's literary achievements.[32] The possessive of a cited French title with a silent plural ending is uncertain: "Trois femmes's long and complicated publication history",[33] but "Les noces' singular effect was 'exotic primitive'..." (with nearby sibilants -ce- in noces and s- in singular).[34] Compare treatment of other titles, above.

Guides typically seek a principle that will yield uniformity, even for foreign words that fit awkwardly with standard English punctuation.

Possessives in geographic names

Place names in the United States do not use the possessive apostrophe on federal maps and signs.[35] The United States Board on Geographic Names, which has responsibility for formal naming of municipalities and geographic features, has deprecated the use of possessive apostrophes since 1890 so as not to show ownership of the place.[35][36] Only five names of natural features in the U.S. are officially spelled with a genitive apostrophe (one example being Martha's Vineyard).[36][37]

On the other hand, the United Kingdom has Bishop's Stortford, Bishop's Castle and King's Lynn (but St Albans, St Andrews and St Helens possibly because their names date to before the use was formalised[citation needed]) and, while Newcastle United play at a stadium previously called St James' Park, and Exeter City at St James Park, London has a St James's Park (this whole area of London is named after St James's Church, Piccadilly[38]). The special circumstances of the latter case may be this: the customary pronunciation of this place name is reflected in the addition of an extra -s; since usage is firmly against a doubling of the final -s without an apostrophe, this place name has an apostrophe. This could be regarded by some people as an example of a double genitive: it refers to the park of the church of St James.

Omission of the apostrophe in geographical names is becoming standard in some English-speaking countries, including Australia.[39] Modern usage has been influenced by considerations of technological convenience including the economy of typewriter ribbons and films, and similar computer character "disallowance" which tend to ignore traditional canons of correctness.[40] Practice in the United Kingdom and Canada is not so uniform.[41]

Possessives in names of organizations

Sometimes the apostrophe is omitted in the names of clubs, societies, and other organizations, even though the standard principles seem to require it: Country Women's Association, but International Aviation Womens Association;[42] Magistrates' Court of Victoria,[43] but Federated Ship Painters and Dockers Union. Usage is variable and inconsistent. Style guides typically advise consulting an official source for the standard form of the name; some tend towards greater prescriptiveness, for or against such an apostrophe.[44] As the case of womens shows, it is not possible to analyze these forms simply as non-possessive plurals, since women is the only correct plural form of woman.

Possessives in business names

Where a business name is based on a family name it should take an apostrophe, but many leave it out (contrast Sainsbury's with Harrods). In recent times there has been an increasing tendency to drop the apostrophe. Names based on a first name are more likely to take an apostrophe (Joe's Crab Shack). Some business names may inadvertently spell a different name if the name with an s at the end is also a name, such as Parson. A small activist group called the Apostrophe Protection Society[45] has campaigned for large retailers such as Harrods, Currys, and Selfridges to reinstate their missing punctuation. A spokesperson for Barclays PLC stated, "It has just disappeared over the years. Barclays is no longer associated with the family name."[46] Further confusion can be caused by businesses whose names tend to look like they are pronounced differently without an apostrophe such as Paulos Circus, and other companies that leave the apostrophe out of their logos but include it in written text, such as Waterstone's and Cadwalader's.

Apostrophe showing omission

An apostrophe is commonly used to indicate omitted characters, normally letters:

  • It is used in contractions, such as can't from cannot, it's from it is or it has, and I'll from I will or I shall.[47]
  • It is used in abbreviations, as gov't for government. It may indicate omitted numbers where the spoken form is also capable of omissions, as '70s for 1970s representing seventies for nineteen-seventies. In modern usage, apostrophes are generally omitted when letters are removed from the start of a word, particularly for a compound word. For example, it is not common to write 'bus (for omnibus), 'phone (telephone), 'net (Internet). However, if the shortening is unusual, dialectal or archaic, the apostrophe may still be used to mark it (e.g., 'bout for about, 'less for unless, 'twas for it was). Sometimes a misunderstanding of the original form of a word results in an incorrect contraction. A common example: 'til for until, though till is in fact the original form, and until is derived from it.
    • The spelling fo'c's'le, contracted from the nautical term forecastle, is unusual for having three apostrophes. The spelling bo's'n's (from boatswain's), as in Bo's'n's Mate, also has three apostrophes, two showing omission and one possession. Fo'c's'le may also take a possessive s – as in the fo'c's'le's timbers – giving four apostrophes in one word.[48]
  • It is sometimes used when the normal form of an inflection seems awkward or unnatural; for example, KO'd rather than KOed (where KO is used as a verb meaning "to knock out"); "a spare pince-nez'd man" (cited in OED, entry for "pince-nez"; pince-nezed is also in citations).
  • In certain colloquial contexts, an apostrophe's function as possessive or contractive can depend on other punctuation.
    • We rehearsed for Friday's opening night. (We rehearsed for the opening night on Friday.)
    • We rehearsed because Friday's opening night. (We rehearsed because Friday is opening night. "Friday's" here is a contraction of "Friday is.")
  • Eye dialects use apostrophes in creating the effect of a non-standard pronunciation.

Use in forming certain plurals

An apostrophe is used by some writers to form a plural for abbreviations, acronyms, and symbols where adding just s rather than 's may leave things ambiguous or inelegant. Some specific cases:

  • It is generally acceptable to use apostrophes to show plurals of single lower-case letters,[49] such as be sure to dot your i's and cross your t's. Some style guides would prefer to use a change of font: dot your is and cross your ts.[citation needed] Some style guides rule that upper case letters need no apostrophe (I got three As in my exams[49]) except when there is a risk of misreading, such as at the start of a sentence: A's are the highest marks achievable in these exams.
  • For groups of years, the apostrophe at the end is unnecessary, since there is no possibility of misreading. For this reason, some style guides prefer 1960s to 1960's[49] (although the latter is noted by at least one source as acceptable in American usage),[50] and 90s or '90s to 90's or '90's.
  • The apostrophe is sometimes used in forming the plural of numbers (for example, 1000's of years); however, as with groups of years, it is unnecessary because there is no possibility of misreading. Most sources are against this usage.[citation needed]
  • The apostrophe is often used in plurals of symbols. Again, since there can be no misreading, this is often regarded as incorrect;[49] the form without apostrophes would be That page has too many &s and #s on it.[citation needed]

Use in non-English names

Names that are not strictly native to English sometimes have an apostrophe substituted to represent other characters (see also As a mark of elision, below).

  • Anglicised versions of Irish surnames often contain an apostrophe after an O, for example O'Doole.
  • Some Scottish and Irish surnames use an apostrophe after an M, for example M'Gregor. The apostrophe here may be seen as marking a contraction where the prefix Mc or Mac would normally appear. (In earlier and meticulous current usage, the symbol is actually ‘ – a kind of reversed apostrophe that is sometimes called a turned comma, which eventually came to be written as the letter c, whose shape is similar.)[51]
  • In science fiction, the apostrophe is often used in alien names, sometimes to indicate a glottal stop (for example T'Pau in Star Trek), but also sometimes simply for decoration.

Use in transliteration

In transliterated foreign words, an apostrophe may be used to separate letters or syllables that otherwise would likely be interpreted incorrectly. For example:

  • in the Arabic word mus'haf, a transliteration of مصحف, the syllables are as in mus·haf, not mu·shaf
  • in the Japanese name Shin'ichi, the apostrophe shows that the pronunciation is shi·n·i·chi (hiragana しんいち), where the letters n (ん) and i (い) are separate moras, rather than shi·ni·chi (しにち).
  • in the Chinese Pinyin romanization, when two hanzi are combined to form one word, if the resulting Pinyin representation can be misinterpreted, they should be separated by an apostrophe. For example, 先 (xiān) versus 西安 (xī'ān).
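The Pinyin case lends itself to a short sketch. The text above says only that ambiguous combinations should be separated. The code below assumes the usual Hanyu Pinyin convention that an apostrophe is placed before any non-initial syllable beginning with a, o or e. The function names are hypothetical, and the input is assumed to be a non-empty list of already segmented, non-empty syllables.

```python
import unicodedata

def starts_with_a_o_e(syllable: str) -> bool:
    """True if the syllable begins with a, o or e, ignoring tone marks."""
    # NFD splits a tone-marked vowel into base letter + combining mark,
    # so the first code point is the bare letter.
    base = unicodedata.normalize("NFD", syllable)[0].lower()
    return base in "aoe"

def join_pinyin(syllables: list[str]) -> str:
    """Join syllables into one word, adding apostrophes where needed."""
    word = syllables[0]
    for syllable in syllables[1:]:
        if starts_with_a_o_e(syllable):
            word += "'"   # keeps xī + ān from being read as xiān
        word += syllable
    return word

print(join_pinyin(["xī", "ān"]))    # two syllables: apostrophe inserted
print(join_pinyin(["xiān"]))        # one syllable: nothing to separate
```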

Furthermore, an apostrophe may be used to indicate a glottal stop in transliterations. For example:

  • in the Arabic word Qur'an, a common transliteration of (part of) القرآن al-qur'ān, the apostrophe corresponds to the diacritic maddah over the 'alif, one of the letters in the Arabic alphabet

Rather than ʿ, the apostrophe is sometimes used to indicate a voiced pharyngeal fricative, as it sounds and looks like the glottal stop to most English speakers. For example:

  • in the Arabic word Ka'aba for الكعبة al-kaʿbah, the apostrophe corresponds to the Arabic letter ʿayn.

Non-standard English use

Failure to observe standard use of the apostrophe is widespread and frequently criticised as incorrect,[52][53] often generating heated debate. The British founder of the Apostrophe Protection Society earned a 2001 Ig Nobel prize for "efforts to protect, promote and defend the differences between plural and possessive".[54] A 2004 report by OCR, a British examination board, stated that "the inaccurate use of the apostrophe is so widespread as to be almost universal".[55] A 2008 survey found that nearly half of the UK adults polled were unable to use the apostrophe correctly.[53]

Superfluous apostrophes ("greengrocers' apostrophes")

Sign to Green Craigs housing development

Apostrophes used in a non-standard manner to form noun plurals are known as greengrocers' apostrophes or grocers' apostrophes, often called (spelled) greengrocer's apostrophes[56] and grocer's apostrophes.[57] They are sometimes humorously called greengrocers apostrophe's, rogue apostrophes, or idiot's apostrophes (a literal translation of the German word Deppenapostroph, which criticises the misapplication of apostrophes in Denglisch). The practice, once common and acceptable (see Historical development), comes from the identical sound of the plural and possessive forms of most English nouns. It is often criticised as a form of hypercorrection coming from a widespread ignorance of the proper use of the apostrophe or of punctuation in general. Lynne Truss, author of Eats, Shoots & Leaves, points out that before the 19th century, it was standard orthography to use the apostrophe to form a plural of a foreign-sounding word that ended in a vowel (e.g., banana's, folio's, logo's, quarto's, pasta's, ouzo's) to clarify pronunciation. Truss says this usage is no longer considered proper in formal writing.[58]

The term is believed to have been coined in the middle of the 20th century by a teacher of languages working in Liverpool, at a time when such mistakes were common in the handwritten signs and advertisements of greengrocers (e.g., Apple's 1/- a pound, Orange's 1/6d a pound). Some have argued that its use in mass communication by employees of well-known companies has led to the less literate assuming it to be correct and adopting the habit themselves.[59]

The same use of the apostrophe before noun plural -s forms is sometimes made by non-native speakers of English. For example, in Dutch, the apostrophe is inserted before the s when pluralising most words ending in a vowel or y: for example, baby's (English babies) and radio's (English radios). This often produces so-called "Dunglish" errors when carried over into English.[60] Hyperforeignism has been formalised in some pseudo-anglicisms. For example, the French word pin's (from English pin) is used (with the apostrophe in both singular and plural) for collectable lapel pins. Similarly, there is an Andorran football club called FC Rànger's (after such British clubs as Rangers F.C.), a Japanese dance group called Super Monkey's, and a Japanese pop punk band called the Titan Go King's.[61]

The widespread use of apostrophes before the s of plural nouns has led to the incorrect belief that an apostrophe is also needed before the s of the third-person present tense of a verb. Thus, he take's, it begin's, etc.[citation needed]

Omission

There is a tendency to drop apostrophes in many commonly used names such as St Annes, St Johns Lane,[62] and so on.

In 2009, a resident in Royal Tunbridge Wells was accused of vandalism after he painted apostrophes on road signs that had spelt St John's Close as St Johns Close.[63]

UK supermarket chain Tesco omits the mark where standard practice would require it. Signs in Tesco advertise (among other items) "mens magazines", "girls toys", "kids books" and "womens shoes". In his book Troublesome Words, author Bill Bryson lambasts Tesco for this, stating that "the mistake is inexcusable, and those who make it are linguistic Neanderthals."[64]

Advocates of greater or lesser use

A sign diverting passengers to a temporary taxi rank at Leeds railway station, West Yorkshire, United Kingdom, with the extraneous apostrophe crossed out by an unknown copy editor

George Bernard Shaw, a proponent of English spelling reform on phonetic principles, argued that the apostrophe was mostly redundant. He did not use it for spelling cant, hes, etc. in many of his writings. He did however allow I'm and it's.[65] Hubert Selby, Jr. used a slash instead of an apostrophe mark for contractions and did not use an apostrophe at all for possessives. Lewis Carroll made greater use of apostrophes, and frequently used sha'n't, with an apostrophe in place of the elided "ll" as well as the more usual "o".[66][citation needed] These authors' usages have not become widespread.

Other misuses

The British pop group Hear'Say famously made unconventional use of an apostrophe in its name. Truss comments that "the naming of Hear'Say in 2001 was [...] a significant milestone on the road to punctuation anarchy".[67] Dexys Midnight Runners, on the other hand, omit the apostrophe (though "dexys" can be understood as a plural form of "dexy", rather than a possessive form).

An apostrophe wrongly thought to be misused in popular culture occurs in the name of Liverpudlian rock band The La's. This apostrophe is often thought to be a mistake; but in fact it marks omission of the letter d. The name comes from the Scouse slang for "The Lads".

Criticism

Over the years, the use of apostrophes has been criticized. George Bernard Shaw called them "uncouth bacilli". Writing in American Speech, linguist Steven Byington stated of the apostrophe that "the language would be none the worse for its abolition." Adrian Room, in his English Journal article "Axing the Apostrophe", argued that apostrophes are unnecessary and that context will resolve any ambiguity.[68] In a letter to the English Journal, Peter Brodie stated that apostrophes are "largely decorative...[and] rarely clarify meaning".[69] Dr. John C. Wells, Emeritus Professor of Phonetics at University College London, says the apostrophe is "a waste of time". Peter Buck, guitarist of R.E.M., claimed, "We all hate apostrophes. There's never been a good rock album that's had an apostrophe in the title".[68]

Non-English use

As a mark of elision

In many languages, especially European languages, the apostrophe is used to indicate the elision of one or more sounds, as in English.

  • In Afrikaans the apostrophe is used to show that letters have been omitted from words. The most common use is in the indefinite article 'n, which is a contraction of een meaning "one" (the number). As the initial e is omitted and cannot be capitalised, if a sentence begins with 'n the second word in the sentence is capitalised. For example: 'n Boom is groen, "A tree is green". In addition, the apostrophe is used for plurals and diminutives where the root ends with certain vowels, e.g. foto's, taxi's, Lulu's, Lulu'tjie, garage's etc.[70]
  • In Danish, apostrophes are sometimes seen on commercial materials. One might commonly see Ta' mig med ("Take me with [you]") next to a stand with advertisement leaflets; that would be written Tag mig med in standard orthography. As in German, the apostrophe must not be used to indicate the possessive, except when there is already an s present in the base form, as in Lukas' bog.
  • In Dutch, the apostrophe is used to indicate omitted characters. For example, the indefinite article een can be shortened to 'n, and the definite article het shortened to 't. When this happens in the first word of a sentence, the second word of the sentence is capitalised. In general, this way of using the apostrophe is considered non-standard, except in 's morgens, 's middags, 's avonds, 's nachts (for des morgens, des middags, des avonds, des nachts: "at morning, at afternoon, at evening, at night"). In addition, the apostrophe is used for plurals where the singulars end with certain vowels, e.g. foto's, taxi's; and for the genitive of proper names ending with these vowels, e.g. Anna's, Otto's. These are in fact elided vowels; use of the apostrophe prevents spellings like fotoos and Annaas.
  • In Esperanto, the Fundamento limits the elision mark to the definite article l' (from la) and singular nominative nouns (kor' from koro, "heart"). This is mostly confined to poetry. Idiomatic phrases such as dank' al (from (kun) danko al, "thanks to") and del' (from de la, "of the") are nonetheless frequent. In-word elision is usually marked with a hyphen, as in D-ro (from doktoro, "Dr"). Some early guides used and advocated the use of apostrophes between word parts, to aid recognition of such compound words as gitar'ist'o, "guitarist".
  • In Catalan, French, Italian, Ligurian and Occitan word sequences such as (coup) d'état, (maître) d'hôtel (often shortened to maître d', when used in English), L'Aquila and L'Hospitalet de Llobregat the final vowel in the first word (de "of", la "the", etc.) is elided because the word that follows it starts with a vowel or a mute h. Similarly, French has qu'il instead of que il ("that he"), c'est instead of ce est ("it is or it's"), and so on. Catalan, French, Italian and Occitan surnames sometimes contain apostrophes of elision, e.g. d'Alembert, D'Angelo.
  • French feminine singular possessive adjectives do not undergo elision, but change to the masculine form instead: ma preceding église becomes mon église ("my church").[71]
  • In modern Norwegian, the apostrophe marks that a word has been contracted, such as "ha’kke" from "har ikke" (have not). Unlike English and French, such elisions are not accepted as part of standard orthography but are used to create a more "oral style" in writing. The apostrophe is also used to mark the genitive for words that end in an -s sound: words ending in -s, -x, and -z, some speakers also including words ending in the sound [ʃ]. As Norwegian doesn't form the plural with -s, there is no need to distinguish between an -s forming the possessive and the -s forming the plural. Therefore we have "mann" (man) and "manns" (man's), without apostrophe, but "los" (naval pilot) and "los’" (naval pilot's). Indicating the possessive for former American Presidents George Bush, whose names end in [ʃ], could be written as both Bushs (simply adding an -s to the name) and Bush’ (adding an apostrophe to the end of the name).
  • In Portuguese the apostrophe is also used in a few combinations such as caixa-d'água ("water tower"), galinha-d'angola ("Helmeted Guineafowl"), pau-d'alho ("Gallesia integrifolia"), etc. Portuguese has many contractions between prepositions and articles or pronouns (like na for em + a), but these are written without an apostrophe. Portuguese uses a grave accent to indicate that an unstressed a has been elided with a following stressed one, so one writes (and says) àquela hora instead of a aquela hora.
  • Modern Spanish no longer uses the apostrophe to indicate elision in standard writing, although it can sometimes be found in older poetry for that purpose.[72] Instead Spanish writes out the spoken elision in full (de enero, mi hijo) except for the contraction del for de + el, which uses no apostrophe. Spanish also switches to the masculine article immediately before a feminine noun beginning with a stressed a instead of writing (or saying) an elision: un águila blanca, el águila blanca, and el agua pura but una/la blanca águila and la pura agua. This reflects the origin of the Spanish definite articles from the Latin demonstratives ille/illa/illum. Although forms with an apostrophe indicating elision, especially m'ijo and m'ija for mi hijo and mi hija, can be found in informal writing, this is considered nonstandard.
  • In Swedish, the apostrophe marks an elision, such as "på sta'n", short for "på staden" ("in the city"), to make the text more similar to the spoken language. This is relaxed style, fairly rarely used, and would not be used by traditional newspapers in political articles, but could be used in entertainment related articles and similar.
  • German usage is very similar: an apostrophe is used almost exclusively to indicate omitted letters. It must not be used for plurals or most of the possessive forms (Max' Vater [Max's father] being one of very few exceptions); although both usages are widespread, they are deemed incorrect. The German equivalent of the greengrocers' apostrophe is the derogatory Deppenapostroph ("idiots' apostrophe"; see the article Apostrophitis in German Wikipedia).
  • In modern printings of Ancient Greek, apostrophes are also used to mark elision. Certain Ancient Greek words that end in short vowels elide when the next word starts with a vowel. For example, many Ancient Greek authors would write δ’ ἄλλος (d'állos) for δὲ ἄλλος (dè állos) and ἆρ’ οὐ (âr' ou) for ἆρα οὐ (âra ou).
  • Initialisms in Hebrew are denoted with a geresh, often typed as an apostrophe. A double geresh (״), known by the plural form gershayim, is used to denote acronyms; it is inserted before (i.e., to the right of) the last letter of the acronym.
  • In Irish, the past tense of verbs beginning with an F or vowel begins with d' (elision of do), for example do oscail becomes d'oscail ("opened") and do fhill becomes d'fhill ("returned"). The copula is is often elided to 's, and do ("to"), mo ("my") etc. are elided before f and vowels.
  • In Ganda, when a word ending with a vowel is followed by a word beginning with a vowel, the final vowel of the first word is elided and the initial vowel of the second word lengthened in compensation. When the first word is a monosyllable, this elision is represented in the orthography with an apostrophe: in taata w'abaana "the father of the children", wa ("of") becomes w'; in y'ani? ("who is it?"), ye ("who") becomes y'. But the final vowel of a polysyllable is always written, even if it is elided in speech: omusajja oyo ("this man"), not *omusajj'oyo, because omusajja ("man") is a polysyllable.
  • Welsh uses the apostrophe to mark elision of the definite article yr ("the") following a vowel (a, e, i, o, u, y, w in Welsh), such as i'r tŷ "to the house". It is also used with the particle yn, such as with mae hi'n "she is".

As a glottal stop

Other languages and transliteration systems use the apostrophe or some similar mark to indicate a glottal stop, sometimes considering it a letter of the alphabet:

The apostrophe represents sounds resembling the glottal stop in the Turkic languages and in some romanizations of Semitic languages, including Arabic. In typography, this function may be performed by the closing single quotation mark. In that case, the Arabic letter ‘ayn (ع) is correspondingly transliterated with the opening single quotation mark.

As a mark of palatalization or non-palatalization

Some languages and transliteration systems use the apostrophe to mark the presence, or the lack of, palatalization.

  • In Belarusian and Ukrainian, the apostrophe is used between a consonant and a following "soft" (iotified) vowel (е, ё, ю, я; Uk. є, ї, ю, я) to indicate that no palatalization of the preceding consonant takes place, and the vowel is pronounced in the same way as at the beginning of the word. It therefore marks a morpheme boundary before /j/, and in Ukrainian it is also occasionally used as a "quasi letter". It appears frequently in Ukrainian, as, for instance, in the words: <п'ять> [p'jat'] 'five', <від'їзд> [vid'jizd] 'departure', <об'єднаний> [ob'jednanyj] 'united', <з'ясувати> [z'jasuvaty] 'to clear up, explain', <п'єса> [p'jesa] 'play (drama)', etc.[74][75]
  • In Russian and some derived alphabets the same function is served by the hard sign (ъ, formerly called yer). But the apostrophe saw some use as a substitute after 1918, when Soviet authorities enforced an orthographic reform by confiscating type bearing that "letter parasite" from stubborn printing houses in Petrograd.[76]
  • In some Latin transliterations of certain Cyrillic alphabets (for Belarusian, Russian, and Ukrainian), the apostrophe is used to replace the soft sign (ь, indicating palatalization of the preceding consonant), e.g., Русь is transliterated Rus' according to the BGN/PCGN system. (The prime symbol is also used for the same purpose.) Some of these transliteration schemes use a double apostrophe ( " ) to represent the apostrophe in Ukrainian and Belarusian text, e.g. Ukrainian слов’янське ("Slavic") is transliterated as slov"yans’ke.
  • Some Karelian orthographies use an apostrophe to indicate palatalization, e.g. n'evvuo ("to give advice"), d'uuri ("just (like)"), el'vüttiä ("to revive").

To separate morphemes

Some languages use the apostrophe to separate the root of a word and its affixes, especially if the root is foreign and unassimilated. (For another kind of morphemic separation see pinyin, below.)

  • In Danish an apostrophe is sometimes used to join the enclitic definite article to words of foreign origin, or to other words that would otherwise look awkward. For example, one would write IP'en to mean "the IP address". There is some variation in what is considered "awkward enough" to warrant an apostrophe; for instance, long-established words such as firma ("company") or niveau ("level") might be written firma'et and niveau'et, but will generally be seen without an apostrophe. Due to Danish influence, this usage of the apostrophe can also be seen in Norwegian, but is incorrect – a hyphen should be used instead: e.g. CD-en (the CD).
  • In Finnish, apostrophes are used in the declension of foreign names or loan words that end in a consonant when written but are pronounced with a vowel ending, e.g. show'ssa ("in a show"), Bordeaux'hun ("to Bordeaux"). For Finnish as well as Swedish, there is a closely related use of the colon.
  • In Estonian, apostrophes can be used in the declension of some foreign names to separate the stem from any declension endings; e.g., Monet' (genitive case) or Monet'sse (illative case) of Monet (name of the famous painter).
  • In Polish, the apostrophe is used exclusively for marking inflections of words and word-like elements (but not acronyms – a hyphen is used instead) whose spelling conflicts with the normal rules of inflection. This mainly affects foreign words and names. For instance, one would correctly write Kampania Ala Gore'a for "Al Gore's campaign". In this example, Ala is spelt without an apostrophe, since its spelling and pronunciation fit into normal Polish rules; but Gore'a needs the apostrophe, because e disappears from the pronunciation, changing the inflection pattern. This rule is often misunderstood as calling for an apostrophe after all foreign words, regardless of their pronunciation, yielding the incorrect Kampania Al'a Gore'a, for example. The effect is akin to the greengrocers' apostrophe (see above).
  • In Turkish, proper nouns are capitalized and an apostrophe is inserted between the noun and any following suffix, e.g. İstanbul'da ("in Istanbul"), contrasting with okulda ("in school").
  • In Welsh the apostrophe is used with infixed pronouns in order to distinguish them from the preceding word (e.g. a'm chwaer "and my sister" as opposed to am chwaer "about a sister").

Miscellaneous uses in other languages

  • In Slovak, the caron over lowercase t, d, l, and uppercase L consonants resembles an apostrophe: ď, ť, ľ, Ľ. This is especially so in certain common typographic renderings, but it is incorrect to use an apostrophe instead of the caron. In Slovak, there is also l with an acute accent: ĺ, Ĺ. In both Czech and Slovak the apostrophe is properly used only to indicate elision in certain words (tys', as an abbreviated form of ty si in Slovak, or pad' for padl in Czech); however, these elisions are restricted to poetry. The apostrophe is also used before a two-digit year number (to indicate the omission of the first two digits): '87.
  • In Finnish, one of the consonant gradation patterns is the change of a k into a hiatus, e.g. keko → keon ("a pile → a pile's"). This hiatus has to be indicated in spelling with an apostrophe if a long vowel or a diphthong would be immediately followed by the final vowel, e.g. ruoko → ruo'on, vaaka → vaa'an. (This is in contrast to compound words, where the equivalent problem is solved with a hyphen, e.g. maa-ala, "land area".) Similarly, the apostrophe is used to mark the hiatus (contraction) that occurs in poetry, e.g. miss' on for missä on ("where is").
  • In Breton, the combination c'h is used for the consonant /x/ (like ch in English Loch Ness), while ch is used for the consonant /ʃ/ (as in French chat or English she).
  • In Italian, an apostrophe is sometimes used as a substitute for a grave or an acute accent after a final vowel: in capitals, or when the proper form of the letter is unavailable. So Niccolò might be rendered as Niccolo', or NICCOLO'; perché, as perche', or PERCHE'. This applies only to machine or computer writing, in the absence of a suitable keyboard. This usage is considered incorrect, or at least inelegant, by many.[who?]
  • In Swahili, an apostrophe after ng shows that there is no sound of /ɡ/ after the /ŋ/ sound; that is, that the ng is pronounced as in English singer, not as in English finger.
  • In Ganda, ng' (pronounced /ŋ/) is used in place of ŋ on keyboards where this character is not available. The apostrophe distinguishes it from the letter combination ng (pronounced [ŋɡ]), which has separate use in the language. Compare this with the Swahili usage above.
  • In Jèrriais, one of the uses of the apostrophe is to mark gemination, or consonant length. For example, t't represents /tː/, s's /sː/, n'n /nː/, th'th /ðː/, and ch'ch /ʃː/ (contrasted with /t/, /s/, /n/, /ð/, and /ʃ/).
  • In the pinyin (hànyǔ pīnyīn) system of romanization for Standard Chinese, an apostrophe is often loosely said to separate syllables in a word where ambiguity could arise. Example: the standard romanization for the name of the city Xī'ān includes an apostrophe to distinguish it from a single-syllable word xian. More strictly, however, it is correct to place an apostrophe only before every a, e, or o that starts a new syllable after the first if it is not preceded by a hyphen or a dash. Examples: Tiān'ānmén, Yǎ'ān; but simply Jǐnán, in which the syllables are ji and nan, since the absence of an apostrophe shows that the syllables are not jin and an (contrast Jīn'ān).[77] This is a kind of morpheme-separation marking (see above).
  • In the largely superseded Wade–Giles romanization for Standard Chinese, an apostrophe marks aspiration of the preceding consonant sound. Example: in tsê (pinyin ze) the consonant represented by ts is unaspirated, but in ts'ê (pinyin ce) the consonant represented by ts' is aspirated.
  • In some systems of romanization for Japanese, the apostrophe is used between moras in ambiguous situations, to differentiate between, for example, na and n + a. (This is similar to the practice in pinyin mentioned above.)
  • In Hebrew, the geresh (a diacritic similar to the apostrophe and often represented by one) is adjacent to letters to show sounds that are not represented in the Hebrew alphabet. Sounds such as j, th, and ch are indicated using ג, ת, and צ with a geresh (informally "chupchik"). For example, the name George is spelled ג׳ורג׳ in Hebrew (with ג׳ representing the first and last consonants).
  • In the new Uzbek Latin alphabet adopted in 2000, the apostrophe serves as a diacritical mark to distinguish different phonemes written with the same letter: it differentiates o' (corresponding to Cyrillic ў) from o, and g' (Cyrillic ғ) from g. This avoids the use of special characters, allowing Uzbek to be typed with ease in ordinary ASCII on any Latin keyboard. In addition, a postvocalic apostrophe in Uzbek represents the glottal stop phoneme derived from Arabic hamzah or ‘ayn, replacing Cyrillic ъ.
  • In English Yorkshire dialect, the apostrophe is used to represent the word the, which is contracted to a more glottal (or "unreleased") /t/ sound. Most users will write in t'barn ("in the barn"), on t'step ("on the step"); and those unfamiliar with Yorkshire speech will often make these sound like intuh barn and ontuh step. A more accurate rendition might be in't barn and on't step, though even this does not truly convey correct Yorkshire pronunciation as the t is more like a glottal stop.
  • Galician restaurants in Madrid in Páginas Amarillas sometimes use O' in their names instead of the standard article O ("The").[78]
  • In standard Lojban orthography, the apostrophe is a letter in its own right (called y'y [ɐhɐ]) that can appear only between two vowels, and is phonemically realized as either [h] or, more rarely, [θ].
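
The stricter pinyin rule described in the list above (an apostrophe before every non-initial syllable that begins with a, e, or o) can be sketched in Python. This is only an illustration: the function name join_pinyin and the assumption that the syllables are already split are hypothetical, and the sketch ignores the exception for syllables preceded by a hyphen or dash.

```python
import unicodedata

def join_pinyin(syllables):
    """Join pre-split pinyin syllables, adding an apostrophe before any
    non-initial syllable that begins with a, e, or o (hypothetical helper)."""
    parts = []
    for index, syllable in enumerate(syllables):
        # Decompose so that a tone-marked vowel such as "ā" starts with plain "a".
        first_letter = unicodedata.normalize("NFD", syllable)[0].lower()
        if index > 0 and first_letter in "aeo":
            parts.append("'")
        parts.append(syllable)
    return "".join(parts)

print(join_pinyin(["Xī", "ān"]))           # apostrophe inserted before "ān"
print(join_pinyin(["Jǐ", "nán"]))          # no apostrophe: second syllable starts with n
print(join_pinyin(["Tiān", "ān", "mén"]))  # apostrophe only before "ān"
```

Because the absence of an apostrophe is itself informative, Jǐnán can only be read as ji + nan, while jin + an has to be written Jīn'ān.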

Typographic form

The form of the apostrophe originates in manuscript writing, as a point with a downwards tail curving clockwise. This form was inherited by the typographic apostrophe ( ’ ), also known as the typeset apostrophe, or, informally, the curly apostrophe. Later sans-serif typefaces had stylized apostrophes with a more geometric or simplified form, but usually retaining the same directional bias as a closing quotation mark.

With the invention of the typewriter, a "neutral" quotation mark form ( ' ) was created to economize on the keyboard, by using a single key to represent: the apostrophe, both opening and closing single quotation marks, single primes, and on some typewriters the exclamation point by overprinting with a period. This is known as the typewriter apostrophe or vertical apostrophe. The same convention was adopted for quotation marks.

Both simplifications carried over to computer keyboards and the ASCII character set. However, although these are widely used due to their ubiquity and convenience, they are deprecated in contexts where proper typography is important.[79]

Computing

ASCII encoding

The typewriter apostrophe ( ' ) was inherited by computer keyboards, and is the only apostrophe character available in the (7-bit) ASCII character encoding, at code value 0x27 (39). As such, it is a highly overloaded character. In ASCII, it represents a right single quotation mark, left single quotation mark, apostrophe, vertical line or prime (punctuation marks), or an acute accent (modifier letters).

Many earlier (pre-1985) computer displays and printers rendered the ASCII apostrophe as a typographic apostrophe, and rendered the ASCII grave accent ( ` ), U+0060, as a matching left single quotation mark. This allowed a more typographic appearance of text: ``I can't'' would appear as ‘‘I can’t’’ on these systems. This can still be seen in many documents prepared at that time, and is still used in the TeX typesetting system to create typographic quotes.
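
As a rough illustration of this convention, the following Python sketch maps grave-accent and apostrophe pairs to Unicode quotation marks. The function name tex_quotes_to_unicode is hypothetical, and plain string replacement is an assumption made for brevity rather than a description of how TeX or older displays worked internally. Like TeX, it turns a doubled mark into a single double quotation mark instead of two single ones.

```python
def tex_quotes_to_unicode(text):
    """Map the ASCII grave-accent/apostrophe quoting convention to Unicode quotes."""
    # Replace the doubled forms first so they are not consumed as two single marks.
    text = text.replace("``", "\u201c")  # left double quotation mark
    text = text.replace("''", "\u201d")  # right double quotation mark
    text = text.replace("`", "\u2018")   # left single quotation mark
    text = text.replace("'", "\u2019")   # right single quotation mark / apostrophe
    return text

print(tex_quotes_to_unicode("``I can't''"))
```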

Typographic apostrophe in 8-bit encodings

Support for the typographic apostrophe ( ’ ) was introduced in a variety of 8-bit character encodings, such as the Apple Macintosh operating system's Mac Roman character set (in 1984), and later in the CP1252 encoding of Microsoft Windows. There is no such character in ISO-8859-1.

Microsoft Windows CP1252 (sometimes incorrectly called ANSI or ISO-Latin) contains the typographic apostrophe at 0x92. Due to "smart quotes" in Microsoft software converting the ASCII apostrophe to this value, other software makers have been forced to adopt this as a de facto convention. For instance the HTML 5 standard specifies that this value is interpreted as CP1252. Some earlier non-Microsoft browsers would display a '?' for this and make web pages composed with Microsoft software somewhat hard to read.
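
The difference between the two encodings can be seen with Python's built-in codecs. This is a minimal sketch; the sample byte string is made up for illustration.

```python
raw = b"can\x92t"  # byte 0x92, as written by software using CP1252 "smart quotes"

as_cp1252 = raw.decode("cp1252")   # 0x92 becomes U+2019, the typographic apostrophe
as_latin1 = raw.decode("latin-1")  # 0x92 becomes U+0092, an invisible control character

print(repr(as_cp1252))
print(repr(as_latin1))

# Going the other way, U+2019 encodes to 0x92 in CP1252 but has no ISO-8859-1 encoding.
print("\u2019".encode("cp1252"))
try:
    "\u2019".encode("latin-1")
except UnicodeEncodeError:
    print("U+2019 cannot be encoded in ISO-8859-1")
```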

Unicode

There are several types of apostrophe character in Unicode:

  • ( ' ) Vertical typewriter apostrophe (Unicode name apostrophe or apostrophe-quote), U+0027, inherited from ASCII.
  • ( ’ ) Punctuation apostrophe (or typographic apostrophe; right single quotation mark; single comma quotation mark), U+2019. Serves as both an apostrophe and closing single quotation mark. This is the preferred character to use for apostrophe according to the Unicode standard.[80]
  • ( ʼ ) Letter apostrophe (or modifier letter apostrophe), U+02BC. This is preferred when the apostrophe is not considered punctuation that separates letters, but a letter in its own right. Examples occur in Breton cʼh, the Cyrillic Azerbaijani alphabet, or in some transliterations such as the transliterated Arabic glottal stop, hamza, or transliterated Cyrillic soft sign. As the letter apostrophe is seldom used in practice, the Unicode standard cautions that one should never assume text is coded thus. The letter apostrophe is rendered identically to the punctuation apostrophe in the Unicode code charts.[81]
  • ( ʻ ) The Hawaiian glottal stop, the ʻokina, has its own Unicode character, U+02BB.
  • ( ˮ ) Letter double apostrophe (Unicode name modifier letter double apostrophe), U+02EE. One of two characters for glottal stop in Nenets.
  • ( ՚ ) Armenian apostrophe, U+055A.
  • ( Ꞌ ꞌ ) The saltillo, the glottal stop of Me'phaa and other languages of Mexico, has its own Unicode characters, U+A78B and U+A78C.
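
The code points listed above can be inspected with Python's standard unicodedata module. The sketch below prints each character's code point, general category, and formal Unicode name; the list name APOSTROPHE_LIKE is just an illustrative label.

```python
import unicodedata

# Code points taken from the list above.
APOSTROPHE_LIKE = [
    "\u0027", "\u2019", "\u02bc", "\u02bb",
    "\u02ee", "\u055a", "\ua78b", "\ua78c",
]

for ch in APOSTROPHE_LIKE:
    # The category separates punctuation (P*) from letters (L*).
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")
```

The general category makes the punctuation-versus-letter distinction explicit: U+0027 and U+2019 are classed as punctuation, whereas U+02BC is classed as a modifier letter.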

Entering apostrophes

Although ubiquitous in typeset material, the typographic apostrophe ( ’ ) is rather difficult to enter on a computer, since it does not have its own key on a standard keyboard. Outside the world of professional typesetting and graphic design, many people do not know how to enter this character and instead use the typewriter apostrophe ( ' ). The typewriter apostrophe has always been considered tolerable on Web pages because of the egalitarian nature of Web publishing and the low resolution of computer monitors in comparison to print.

More recently, the correct use of the typographic apostrophe is becoming more common on the Web due to the wide adoption of the Unicode text encoding standard, higher-resolution displays, and advanced anti-aliasing of text in modern operating systems. Because typewriter apostrophes are now often automatically converted to typographic apostrophes by wordprocessing and desktop-publishing software (see below), the typographic apostrophe does often appear in documents produced by non-professionals.

How to enter typographic apostrophes on a computer

  • Unicode: U+2019 (decimal 8217)
  • Macintosh: Option + Shift + ]
  • Windows-1252 Alt code: Alt + 0146 on number pad
  • Linux/X: AltGr + Shift + B, or Compose ' >
  • HTML entity: &rsquo;

XML (and hence XHTML) defines an &apos; character entity reference for the ASCII typewriter apostrophe. No equivalent entity is defined in the HTML 4 standard,[82] despite all the other predefined character entities from XML being defined in HTML. If it cannot be entered literally in HTML, a numeric character reference could be used instead, such as "&#x27;" or "&#39;".
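
These references can be checked with Python's standard html module, as a small sketch. Note that html.unescape follows the HTML5 entity list, which (unlike HTML 4) does include &apos;.

```python
import html

# Named and numeric references for the typographic apostrophe (U+2019).
print(html.unescape("&rsquo;") == "\u2019")
print(html.unescape("&#8217;") == "\u2019")

# Numeric references for the ASCII typewriter apostrophe (U+0027).
print(html.unescape("&#x27;") == "'")
print(html.unescape("&#39;") == "'")

# html.escape writes the ASCII apostrophe as a numeric reference, not as &apos;.
print(html.escape("it's"))
```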

Smart quotes

To make typographic apostrophes easier to enter, wordprocessing and publishing software often converts typewriter apostrophes to typographic apostrophes during text entry (at the same time converting opening and closing single and double quotes to their correct left-handed or right-handed forms). A similar facility may be offered on web servers after submitting text in a form field, e.g. on weblogs or free encyclopedias. This is known as the smart quotes feature; apostrophes and quotation marks that are not automatically altered by computer programs are known as dumb quotes.

Such conversion is not always done in accordance with the standards for character sets and encodings. Additionally, many such software programs incorrectly convert a leading apostrophe to an opening quotation mark, e.g., in abbreviations of years (‘29 rather than the correct ’29 for the years 1929 or 2029, depending on context) or in ‘twas instead of ’twas as the archaic abbreviation of it was. Smart quote features also often fail to recognise situations when a prime rather than an apostrophe is needed; for example, incorrectly rendering the latitude 49° 53′ 08″ as 49° 53’ 08”.
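
The leading-apostrophe problem is easy to reproduce. The following Python sketch is a deliberately naive converter, not the algorithm of any particular word processor; the function name naive_smart_quotes and its single rule (an apostrophe at the start of the text or after whitespace opens a quotation) are assumptions made for illustration.

```python
import re

def naive_smart_quotes(text):
    """Hypothetical, deliberately naive converter for the ASCII apostrophe."""
    # An apostrophe not preceded by a non-space character is assumed to open a quotation.
    text = re.sub(r"(?<!\S)'", "\u2018", text)
    # Every remaining apostrophe is treated as a closing quote or apostrophe.
    return text.replace("'", "\u2019")

print(naive_smart_quotes("don't"))             # mid-word apostrophe: handled correctly
print(naive_smart_quotes("'quoted'"))          # quotation: handled correctly
print(naive_smart_quotes("the class of '29"))  # leading apostrophe: wrongly opened
print(naive_smart_quotes("'twas"))             # leading apostrophe: wrongly opened
```

A rule based only on the preceding character cannot tell an opening quotation mark from a word-initial apostrophe, which is why such cases usually need a manual correction.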

In Microsoft Word it is possible to turn smart quotes off (in some versions, by navigating through Tools, AutoCorrect, AutoFormat as you type, and then unchecking the appropriate option). Alternatively, typing Control-Z (for Undo) immediately after entering the apostrophe will convert it back to a typewriter apostrophe. In Microsoft Word for Windows, holding down the Control key while typing two apostrophes will produce a single typographic apostrophe.

Programming

Some programming languages, like Pascal, use the ASCII apostrophe to delimit string constants. Often either the apostrophe or the double quote may be used, allowing string constants to contain the other character (but not to contain both without using an escape character).
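
Python is one language that accepts either delimiter, so it serves here as a small illustration; the variable names are arbitrary.

```python
# Double quotes as delimiters let the string contain an apostrophe unescaped.
with_apostrophe = "don't"

# Apostrophes as delimiters let the string contain double quotes unescaped.
with_double_quotes = 'She said "hello"'

# A string containing both characters needs an escape for whichever one delimits it.
with_both = 'She said "don\'t"'

print(with_apostrophe, with_double_quotes, with_both, sep="\n")
```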

The C programming language (and many related languages like C++ or Java) uses apostrophes to delimit a character constant. In C it is seen as a character value only, different from a 1-letter string.

In Visual Basic an apostrophe is used to denote the start of a comment.

See also

References

  1. ^ Quirk, Greenbaum, Leech & Svartvik, A Comprehensive Grammar of the English Language, p985, Longman, London & New York, ISBN 0-582-51734-6, p 1636
  2. ^ "The English form apostrophe is due to its adoption via French and its current pronunciation as four syllables is due to a confusion with the rhetorical device apostrophé" (W. S. Allen, Vox Graeca. The pronunciation of classical Greek, 3rd edition, 1988. Cambridge University Press, Cambridge, p. 100, note 13).
  3. ^ a b c Crystal, David (2003). The Cambridge Encyclopedia of the English Language, Second Edition. Cambridge University Press. p. 203. ISBN 0-521-53033-4. 
  4. ^ Alfred Ewert, The French Language, 1933, Faber & Faber, London, p 119
  5. ^ Pease as an old plural of pea is indeterminate: Lentils' and pease'[s] use in such dishes was optional. Nouns borrowed from French ending in -eau, -eu, -au, or -ou sometimes have alternative plurals that retain the French -x: beaux or beaus; bureaux or bureaus; adieux or adieus; fabliaux or fabliaus; choux or chous. The x in these plurals is often pronounced. If it is, then (in the absence of specific rulings from style guides) the plural possessives are formed with an apostrophe alone: the beaux' [or beaus'] appearance at the ball; the bureaux' [or bureaus'] responses differed. If the x is not pronounced, then in the absence of special rulings the plurals are formed with an apostrophe followed by an s: the beaux's appearance; the bureaux's responses; their adieux's effect was that everyone wept. See also Nouns ending with silent "s", "x" or "z", below, and attached notes.
  6. ^ Style Guide, US Department of Justice, Bureau of Justice Statistics; The United States Government Printing Office Style Manual 2000; The Chicago Manual of Style (CMOS), 5.25: "The possessive of a multiword compound noun is formed by adding the appropriate ending to the last word {parents-in-law's message}."
  7. ^ CMOS, 7.25: "If plural compounds pose problems, opt for of. ... the professions of both my daughters-in-law."
  8. ^ Is the English Possessive 's Truly a Right-Hand Phenomenon?[dead link]
  9. ^ The Chicago Manual of Style, 5.27; New Hart's Rules, §4.2, p. 64; Gregg Reference Manual, §642.
  10. ^ This example is quoted from www.abc.net.au[dead link]; see The Chicago Manual of Style, 7.18.
  11. ^ This example is quoted from The Gregg Reference Manual, 10th edition, 2005, paragraph 641.
  12. ^ This is normal despite the fact that the single word hers is spelled without an apostrophe, see below in this section; His 'n' Hers's first track is theoretically possible but unlikely unless an extra sibilant is actually pronounced after hers.[citation needed]
  13. ^ Most sources[who?] are against continuing the italics used in such titles to the apostrophe and the s.
  14. ^ Online Etymology Dictionary
  15. ^ See for example New Hart's Rules. Not one of the other sources listed on this page supports the use of it's as a possessive form of it.
  16. ^ Frank Bergon, "The Journals of Lewis & Clark" (Penguin, New York, 1989), pages xxiv foll.
  17. ^ Courier Mail, Little things that matter
  18. ^ Oxford Dictionaries: "With personal names that end in -s: add an apostrophe plus s when you would naturally pronounce an extra s if you said the word out loud"; MLA Style Manual, 2nd edition, 1998, 3.4.7e: "To form the possessive of any singular proper noun, add an apostrophe and an s"; [1]: "Grammarians (such as Hart, Fowler, Swan and Lynne Truss) and other authorities, such as the Guardian and Economist styleguides, agree that the -'s form should follow all singular nouns, regardless of whether they end in an -s or not.";The Economist's Style Guide; The Elements of Style makes the same rule, with only sketchily presented exceptions.
  19. ^ The Guardian's Style Guide.
  20. ^ [2]: "For most singular nouns, add an apostrophe and an s (’s) to the end of the word... For names that end with an eez sound, use an apostrophe alone to form the possessive. Examples: "Ramses’ wife," "Hercules’ muscles," "According to Jones’s review, the computer’s graphics card is its Achilles’ heel."
  21. ^ The American Heritage Book of English Usage. 8. Word Formation b. Forming Possessives.
  22. ^ The Times Online Style Guide.
  23. ^ Vanderbilt University's Style Guide.
  24. ^ According to this older system, possessives of names ending in "-x" or "-xe" were usually spelled without a final "s" even when an /s/ or /z/ was pronounced at the end (e.g. "Alex' brother" instead of "Alex's brother"), but the possessives of nouns (e.g. "the fox's fur") were usually spelled as today with a final "s".
  25. ^ Punctuation | Style Guide | CSU Branding Standards Guide | CSU
  26. ^ The Chicago Manual of Style's text: 7.23 An alternative practice. Those uncomfortable with the rules, exceptions, and options outlined above may prefer the system, formerly more common, of simply omitting the possessive s on all words ending in s – hence "Dylan Thomas' poetry," "Maria Callas' singing," and "that business' main concern." Though easy to apply, that usage disregards pronunciation and thus seems unnatural to many.
  27. ^ Chicago Style Q&A: Possessives and Attributives
  28. ^ "DummiesWorld Wide Words". http://www.worldwidewords.org/qa/qa-app2.htm. Retrieved 13 March 2007. . The Chicago Manual of Style, 7.22: "For...sake expressions traditionally omit the s when the noun ends in an s or an s sound." Oxford Style Manual, 5.2.1: "Use an apostrophe alone after singular nouns ending in an s or z sound and combined with sake: for goodness' sake".
  29. ^ "Practice varies widely in for conscience' sake and for goodness' sake, and the use of an apostrophe in them must be regarded as optional" The New Fowler's Modern English Usage, ed. Burchfield, RW, 3rd edition, 1996, entry for "sake", p. 686.
  30. ^ Starble, Jonathan M. (9 October 2006). Gimme an S: The Robert Court splits over grammar. Legal Times Last accessed 17 December 2011.
  31. ^ In February 2007 Arkansas historian Parker Westbrook successfully petitioned State Representative Steve Harrelson to settle once and for all that the correct possessive should not be Arkansas' but Arkansas's (Arkansas House to argue over apostrophes). Arkansas's Apostrophe Act came into law in March 2007 (ABC News [USA], 6 March 2007).
  32. ^ An apparent exception is The Complete Stylist, Sheridan Baker, 2nd edition 1972, p. 165: "...citizens' rights, the Joneses' possessions, and similarly The Beaux' Stratagem." But in fact the x in beaux, as in other such plurals in English, is often already pronounced (see a note to Basic rule (plural nouns), above); The Beaux Stratagem, the title of a play by George Farquhar (1707), originally lacked the apostrophe (see the title page of a 1752 edition); and it is complicated by the following s in stratagem. Some modern editions add the apostrophe (some with an s also), some omit it; and some make a compound with a hyphen: The Beaux-Stratagem. Farquhar himself used the apostrophe elsewhere in the standard ways, for both omission and possession.
  33. ^ Jacqueline Letzter, Intellectual Tacking: Questions of Education in the Works of Isabelle de Charrière, Rodopi, 1998, p. 123.
  34. ^ Elizabeth A. McAlister, Rara!: Vodou, Power, and Performance in Haiti and Its Diaspora, University of California Press, 2002, p. 196.
  35. ^ a b "Apostrophe Cops: Don't Be So Possessive". The New York Times (Sunday Magazine). 10 March 1996. http://www.nytimes.com/1996/03/10/magazine/sunday-march-10-1996-apostrophe-cops-don-t-be-so-possessive.html. 
  36. ^ a b U.S. Board on Geographic Names: FAQs
  37. ^ Cavella, C, and Kernodle, RA, "How the Past Affects the Future: the Story of the Apostrophe"
  38. ^ St James's Church Piccadilly website
  39. ^ "The apostrophe has been dropped from most Australian place-names and street names: Connells Point; Wilsons Promontory; Browns Lane." The Penguin Working Words: an Australian Guide to Modern English Usage, Penguin, 1993, p. 41.
  40. ^ E.g., under Naming conventions in Active Directory for computers, domains, sites, and OUs at Microsoft Support
  41. ^ The Cambridge Guide to English Usage, Ed. Peters, P, 2004, p. 43.
  42. ^ International Aviation Womens Association
  43. ^ Spelled both with and without the apostrophe at the court's own home page; but spelled with the apostrophe in Victorian legislation, such as Magistrates' Court Act, 1989.
  44. ^ Gregg Reference Manual, 10th edition, 2003, distinguishes between what it calls possessive and descriptive forms, and uses this distinction in analyzing the problem. From paragraph 628: "a. Do not mistake a descriptive form ending in s for a possessive form[:] sales effort (sales describes the kind of effort)... b. Some cases can be difficult to distinguish. Is it the girls basketball team or the girls' basketball team? Try substituting an irregular plural like women. You would not say the women basketball team; you would say the women's basketball team. By analogy, the girls' basketball team is correct" [italics given exactly as in original, including following punctuation]. And then this principle is applied to organizations at paragraph 640, where examples are given, including the non-conforming Childrens Hospital, (in Los Angeles): "The names of many organizations, products, and publications contain words that could be considered either possessive or descriptive terms... c. In all cases follow the organization's preference when known."
  45. ^ Apostrophe Protection Society's website.
  46. ^ Times Online: Harrods told to put its apostrophe back.
  47. ^ In reports of very informal speech 's may sometimes represent does: "Where's that come from?"
  48. ^ SOED gives fo'c's'le as the only shortened form of forecastle, though others are shown in OED. SOED gives bo's'n as one spelling of bosun, itself a variant of boatswain.
  49. ^ a b c d "Purdue University Online Writing Lab: The Apostrophe". http://owl.english.purdue.edu/handouts/grammar/g_apost.html. Retrieved 13 March 2007. 
  50. ^ Guide to Punctuation, Larry Trask, University of Sussex: "American usage, however, does put an apostrophe here: (A) This research was carried out in 1970s."
  51. ^ "M‘Culloch and the Turned Comma". http://www.greenbag.org/v12n3/v12n3_collins.pdf. Retrieved 14 March 2012. 
  52. ^ Truss, Lynne. Eats, Shoots & Leaves. p. 41, pp. 48–54.
  53. ^ a b Half of Britons struggle with the apostrophe, The Daily Telegraph, 11 November 2008
  54. ^ "In praise of apostrophes", BBC News, 5 October 2001
  55. ^ 'Fatal floors' in exam scripts, BBC News, 3 November 2004
  56. ^ Word Spy - greengrocers' apostrophe
  57. ^ "Style guide". The Guardian (London). 16 December 2008. http://www.guardian.co.uk/styleguide/a#id-3016449. 
  58. ^ Truss, Lynne. Eats, Shoots & Leaves. pp. 63–65.
  59. ^ Christina Cavella and Robin A. Kernodle (PDF). How the Past Affects the Future: The Story of the Apostrophe. American University. Archived from the original on 2009-03-26. http://web.archive.org/web/20090326014513/http://www.american.edu/tesol/wpkernodlecavella.pdf. Retrieved 26 October 2006. 
  60. ^ Burrough-Boenisch, Joy (2004). "Dutch Greengrocers". Righting English That's Gone Dutch (2nd ed.). Kemper Conseil Publishing. pp. 39–40. ISBN 978-90-76542-08-9 
  61. ^ Titan Go King's, at nippop.com.
  62. ^ A search on www.multimpap.com for "St Johns Lane" in the UK, with or without apostrophe, finds the apostrophe omitted in 5 instances out of 25
  63. ^ Fernandez, Colin, 'Punctuation hero' branded a vandal for painting apostrophes on street signs, The Daily Mail, accessed 19 August 2009
  64. ^ Bill Bryson, "Troublesome Words," Penguin, second edition 1987, p. 177
  65. ^ W. W. Norton & Company
  66. ^ The apostrophe
  67. ^ Eats, Shoots & Leaves.
  68. ^ a b Nordquist, Richard (29 October 2008). "The Long Campaign to Abolish the Apostrophe". About.com. http://grammar.about.com/b/2008/10/29/the-long-campaign-to-abolish-the-apostrophe.htm. Retrieved 1 May 2011. 
  69. ^ Brodie, Peter (November 1996). "Never Say NEVER: Teaching Grammar and Usage". The English Journal (National Council of Teachers of English) 85 (7): 78. JSTOR 820514. 
  70. ^ Afrikaanse Woordelys en Spelreëls (9th ed.). Cape Town, South Africa: Pharos Woordeboeke. 2002. ISBN 1-86890-034-7. http://www.nb.co.za/product/afrikaanse-woordelys-en-spelre-ls----de-uitgawe/7390/. 
  71. ^ In early French such elisions did occur: m'espée (ma +espée, modern French mon épée: "my sword"), s'enfance (sa +enfance, son enfance: "his or her childhood"). But the only modern survivals of this elision with apostrophe are m'amie and m'amour, as archaic and idiomatic alternatives to mon amie and mon amour ("my [female] friend", "my love"); forms without the apostrophe also used: mamie or ma mie, mamour.
  72. ^ Examples include Nuestras vidas son los ríos / que van a dar en la mar, / qu'es el morir. meaning "Our lives are the rivers / that flow to give to the sea, / which is death." (from Coplas de Don Jorge Manrique por la muerte de su padre, 1477) and ¿ ... qué me ha de aprovechar ver la pintura / d'aquel que con las alas derretidas ...? meaning "... what could it help me to see the painting of that one with the melted wings ...?" (from the 12th sonnet of Garcilazo de la Vega, c. 1500–1536).
  73. ^ Language Construction Kit, refers to the common phenomenon of adding apostrophes to make names appear "alien"
  74. ^ Daniel Bunčić (Bonn), "The apostrophe: A neglected and misunderstood reading aid" at the Tübingen University website
  75. ^ Linguist List 13.1566, Daniel Bunčić, "Apostrophe rules in languages", from 31 May 2002.
  76. ^ ГТРК – Владимир :: Главная (Russian)
  77. ^ Pinyin
  78. ^ restaurants with an 'o' in their names in Madrid.
  79. ^ Apostrophe Atrophy
  80. ^ The Unicode Consortium
  81. ^ Unicode code charts
  82. ^ "Character entity references in HTML 4". World Wide Web Consortium. 24 December 1999. http://www.w3.org/TR/1999/REC-html401-19991224/sgml/entities.html. Retrieved 15 October 2011. 

This page contains text from Wikipedia, the Free Encyclopedia - http://en.wikipedia.org/wiki/Apostrophe

This article is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License, which means that you can copy and modify it as long as the entire work (including additions) remains under this license.


Text file created with gedit and viewed with a hex editor. Besides the text objects, there are only EOL markers with the hexadecimal value 0A.

In computing, a newline,[1] also known as a line break or end-of-line (EOL) marker, is a special character or sequence of characters signifying the end of a line of text. The name comes from the fact that the next character after the newline will appear on a new line—that is, on the next line below the text immediately preceding the newline. The actual codes representing a newline vary across operating systems, which can be a problem when exchanging text files between systems with different newline representations.

There is also some confusion whether newlines terminate or separate lines. If a newline is considered a separator, there will be no newline after the last line of a file. The general convention on most systems is to add a newline even after the last line, i.e. to treat newline as a line terminator. Some programs have problems processing the last line of a file if it is not newline terminated. Conversely, programs that expect newline to be used as a separator will interpret a final newline as starting a new (empty) line.
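
The difference between the two interpretations can be seen in a minimal Python sketch. The sample string is made up for illustration: splitting on the newline character treats it as a separator, while str.splitlines() treats it as a terminator.

```python
text = "first\nsecond\n"  # sample data; the last line is newline-terminated

# Newline as a separator: the final newline starts a new, empty line.
as_separator = text.split("\n")

# Newline as a terminator: the final newline only ends the last line.
as_terminator = text.splitlines()

print(as_separator)   # ['first', 'second', '']
print(as_terminator)  # ['first', 'second']
```

A program that counts lines with the first approach reports three lines for this text, while one using the second reports two.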

In text intended primarily to be read by humans using software which implements the word wrap feature, a newline character typically only needs to be stored if a line break is required independent of whether the next word would fit on the same line, such as between paragraphs and in vertical lists. See hard return and soft return.

Representations

Software applications and operating systems usually represent a newline with one or two control characters:

  • Systems based on ASCII or a compatible character set use either LF (Line feed, '\n', 0x0A, 10 in decimal) or CR (Carriage return, '\r', 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, '\r\n', 0x0D0A). These characters are based on printer commands: a line feed instructed the printer to advance the paper by one line, and a carriage return instructed it to return the printer carriage to the beginning of the current line. Some rare systems, such as QNX before version 4, used the ASCII RS (record separator, 0x1E, 30 in decimal) character as the newline character.
  • EBCDIC systems—mainly IBM mainframe systems, including z/OS (OS/390) and i5/OS (OS/400)—use NEL (Next Line, 0x15) as the newline character. Note that EBCDIC also has control characters called CR and LF, but the numerical value of LF (0x25) differs from the one used by ASCII (0x0A). Additionally, there are some EBCDIC variants that also use NEL but assign a different numeric code to the character.
  • Operating systems for the CDC 6000 series defined a newline as two or more zero-valued six-bit characters at the end of a 60-bit word. Some configurations also defined a zero-valued character as a colon character, with the result that multiple colons could be interpreted as a newline depending on position.
  • ZX80 and ZX81, home computers from Sinclair Research Ltd used a specific non-ASCII character set with code NEWLINE (0x76, 118 decimal) as the newline character.
  • OpenVMS uses a record-based file system, which stores text files as one record per line. In most file formats, no line terminators are actually stored, but the Record Management Services facility can transparently add a terminator to each line when it is retrieved by an application. The records themselves could contain the same line terminator characters, which could either be considered a feature or a nuisance depending on the application.
  • Fixed line length was used by some early mainframe operating systems. In such a system, an implicit end-of-line was assumed every 80 characters, for example. No newline character was stored. If a file was imported from the outside world, lines shorter than the line length had to be padded with spaces, while lines longer than the line length had to be truncated. This mimicked the use of punched cards, on which each line was stored on a separate card, usually with 80 columns on each card. Many of these systems added a carriage control character to the start of the next record; this could indicate whether the next record was a continuation of the line started by the previous record, a new line, or a line that should overprint the previous one (similar to a CR). Often this was a normal printing character such as '#', which thus could not be used as the first character in a line. Some early line printers interpreted these characters directly in the records sent to them.
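
The fixed line length scheme in the last item can be illustrated with a small Python sketch. The function names and the 80-character default are assumptions chosen for the example, and carriage control characters are left out.

```python
RECORD_LENGTH = 80  # assumed record size, matching an 80-column punched card


def to_fixed_records(text: str, length: int = RECORD_LENGTH) -> str:
    """Store each line as one fixed-length record, with no newline stored."""
    records = []
    for line in text.splitlines():
        # Truncate lines that are too long, then pad short ones with spaces.
        records.append(line[:length].ljust(length))
    return "".join(records)


def from_fixed_records(data: str, length: int = RECORD_LENGTH) -> list:
    """Recover lines by cutting every `length` characters and dropping padding."""
    return [data[i:i + length].rstrip(" ") for i in range(0, len(data), length)]
```

Note that the round trip is lossy in this sketch: characters beyond the record length are discarded, and trailing spaces in the original line cannot be told apart from padding.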

Most textual Internet protocols (including HTTP, SMTP, FTP, IRC and many others) mandate the use of ASCII CR+LF (0x0D 0x0A) on the protocol level, but recommend that tolerant applications recognize lone LF as well. In practice, there are many applications that erroneously use the C newline character '\n' instead (see section Newline in programming languages below). This leads to problems when trying to communicate with systems adhering to a stricter interpretation of the standards; one such system is the qmail MTA that actively refuses to accept messages from systems that send bare LF instead of the required CR+LF.[2]
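
As a hedged illustration of this rule, the following Python sketch builds a request with explicit CR+LF bytes and parses incoming lines tolerantly. The helper names are hypothetical, and the request is reduced to the bare minimum.

```python
import re

CRLF = b"\r\n"  # spelled out as bytes, never left to a platform default


def build_request(host: str) -> bytes:
    """Build a minimal HTTP/1.1 request, ending every line with CR+LF."""
    lines = [b"GET / HTTP/1.1", b"Host: " + host.encode("ascii"), b""]
    return CRLF.join(lines) + CRLF


def split_protocol_lines(data: bytes) -> list:
    """Tolerant receiver: accept CR+LF, and also a lone LF, as a line ending."""
    lines = re.split(rb"\r?\n", data)
    # The data ends with a line terminator, so drop the final empty piece.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

A sender built this way produces the same bytes on every platform, while the receiver still copes with peers that send a bare LF.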

FTP has a feature to transform newlines between CR+LF and LF only when transferring text files. This must not be used on binary files. Usually binary files and text files are recognised by checking their filename extension.

Unicode

The Unicode standard defines a large number of characters that conforming applications should recognize as line terminators:[3]

 LF:    Line Feed, U+000A
 VT:    Vertical Tab, U+000B
 FF:    Form Feed, U+000C
 CR:    Carriage Return, U+000D
 CR+LF: CR (U+000D) followed by LF (U+000A)
 NEL:   Next Line, U+0085
 LS:    Line Separator, U+2028
 PS:    Paragraph Separator, U+2029

This may seem overly complicated compared to an approach such as converting all line terminators to a single character, for example LF. However, Unicode was designed to preserve all information when converting a text file from any existing encoding to Unicode and back. Therefore, Unicode should contain characters included in existing encodings. NEL is included in ISO-8859-1[citation needed] and EBCDIC (0x15). The approach taken in the Unicode standard allows round-trip transformation to be information-preserving while still enabling applications to recognize all possible types of line terminators.
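
A minimal Python sketch of an application that recognizes every terminator in the list above is shown here. The pattern and function name are illustrative assumptions, not part of the Unicode standard.

```python
import re

# The terminators listed above; CR+LF comes first so it counts as one break.
UNICODE_NEWLINE = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")


def split_unicode_lines(text: str) -> list:
    """Split text on any of the Unicode line terminators listed above."""
    return UNICODE_NEWLINE.split(text)


sample = "a\nb\r\nc\x85d\u2028e\u2029f\x0bg\x0ch\ri"
print(split_unicode_lines(sample))
# ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
```

Python's built-in str.splitlines() recognizes the same terminators, along with a few additional separator control characters.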

Newline codes greater than 0x7F are rarely recognized or used in practice. They take multiple bytes in UTF-8, and the code for NEL has been used as the ellipsis ('…') character in Windows-1252. For instance:

  • YAML[4] no longer recognizes them as special in order to be compatible with JSON.
  • ECMAScript[5] accepts LS and PS as line breaks, but considers U+0085 (NEL) white space, not a line break.
  • Microsoft Windows 2000 does not treat any of NEL, LS or PS as a line break in its default text editor, Notepad.
  • On Linux, the popular editor gedit treats LS and PS as newlines but does not treat NEL as one.
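
Both difficulties can be checked directly with Python's standard codecs. This is a small sketch, and the variable names are chosen only for illustration.

```python
NEL, LS, PS = "\x85", "\u2028", "\u2029"

# Each of these code points needs more than one byte in UTF-8.
utf8_lengths = {name: len(ch.encode("utf-8"))
                for name, ch in (("NEL", NEL), ("LS", LS), ("PS", PS))}
print(utf8_lengths)  # {'NEL': 2, 'LS': 3, 'PS': 3}

# The single byte 0x85 is NEL in ISO-8859-1 but an ellipsis in Windows-1252.
as_latin1 = b"\x85".decode("latin-1")
as_cp1252 = b"\x85".decode("cp1252")
print(as_latin1 == NEL, as_cp1252 == "\u2026")  # True True
```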

History

ASCII was developed simultaneously by the ISO and the ASA, the predecessor organization to ANSI. During the period of 1963–1968, the ISO draft standards supported the use of either CR+LF or LF alone as a newline, while the ASA drafts supported only CR+LF.

The sequence CR+LF was in common use on many early computer systems that had adopted Teletype machines, typically a Teletype Model 33 ASR, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the Teletype machine and follow its conventions.

Most minicomputer systems from DEC used this convention. CP/M used it as well, to print on the same terminals that minicomputers used. From there MS-DOS (1981) adopted CP/M's CR+LF in order to be compatible, and this convention was inherited by Microsoft's later Windows operating system.

The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even many early video displays required multiple character times to scroll the display.

The Multics operating system began development in 1964 and used LF alone as its newline. Multics used a device driver to translate this character to whatever sequence a printer needed (including extra padding characters), and the single byte was much more convenient for programming. The seemingly more obvious choice of CR was not used, as a plain CR provided the useful function of overprinting one line with another, and thus it was useful to not translate it. Unix followed the Multics practice, and later systems followed Unix.

In programming languages

To facilitate the creation of portable programs, programming languages provide some abstractions to deal with the different types of newline sequences used in different environments.

The C programming language provides the escape sequences '\n' (newline) and '\r' (carriage return). However, these are not required to be equivalent to the ASCII LF and CR control characters. The C standard only guarantees two things:

  1. Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.
  2. When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, no translation is performed, and the internal representation produced by '\n' is output directly.
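
The same kind of translation can be sketched outside C. The example below uses Python's io module as an analogue, not the C library itself; the helper names are hypothetical, and the "native" newline sequence is passed in explicitly so that the result does not depend on the platform.

```python
import io


def write_text(text: str, native_newline: str) -> bytes:
    """Write `text` through a text-mode stream and return the bytes stored."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=native_newline)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


def read_text(data: bytes) -> str:
    """Read bytes through a text-mode stream that translates newlines to '\\n'."""
    stream = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return stream.read()


print(write_text("a\nb\n", "\r\n"))  # b'a\r\nb\r\n'  ('\n' translated on output)
print(write_text("a\nb\n", ""))      # b'a\nb\n'      (no translation)
print(repr(read_text(b"a\r\nb\r\n")))  # 'a\nb\n'     (translated back on input)
```

Passing an empty string as the newline argument disables translation on output, which roughly corresponds to the binary-mode behaviour described above.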

On Unix platforms, where C originated, the native newline sequence is ASCII LF (0x0A), so '\n' was simply defined to be that value. With the internal and external representation being identical, the translation performed in text mode is a no-op, and text mode and binary mode behave the same. This has caused many programmers who developed their software on Unix systems simply to ignore the distinction completely, resulting in code that is not portable to different platforms.

The C library function fgets() is best avoided in binary mode because any file not written with the UNIX newline convention will be misread. Also, in text mode, any file not written with the system's native newline sequence (such as a file created on a UNIX system, then copied to a Windows system) will be misread as well.

Another common problem is the use of '\n' when communicating using an Internet protocol that mandates the use of ASCII CR+LF for ending lines. Writing '\n' to a text mode stream works correctly on Windows systems, but produces only LF on Unix, and something completely different on more exotic systems. Using "\r\n" in binary mode is slightly better.

Many languages, such as C++, Perl,[6] and Haskell provide the same interpretation of '\n' as C.

Java, PHP,[7] and Python[8] provide the '\r\n' sequence (for ASCII CR+LF). In contrast to C, these are guaranteed to represent the values U+000D and U+000A, respectively.

The Java I/O libraries do not transparently translate these into platform-dependent newline sequences on input or output. Instead, they provide functions for writing a full line that automatically add the native newline sequence, and functions for reading lines that accept any of CR, LF, or CR+LF as a line terminator (see BufferedReader.readLine()). The System.getProperty() method can be used to retrieve the underlying line separator.

Example:

  String eol = System.getProperty( "line.separator" );
  String lineColor = "Color: Red" + eol;

Python permits "Universal Newline Support" when opening a file for reading, when importing modules, and when executing a file.[9]
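
A short sketch of universal newline handling when reading, using Python 3's io module with an in-memory stream in place of a real file (the sample bytes are made up):

```python
import io

mixed = b"unix\nold mac\rdos\r\n"  # made-up sample mixing three conventions

# newline=None enables universal newlines: every ending is read as '\n'.
universal = io.TextIOWrapper(io.BytesIO(mixed), encoding="ascii", newline=None)
lines = universal.readlines()
print(lines)  # ['unix\n', 'old mac\n', 'dos\n']

# The stream records which conventions it has translated so far.
seen = universal.newlines

# newline='' still recognizes all three endings but leaves them untranslated.
untranslated = io.TextIOWrapper(io.BytesIO(mixed), encoding="ascii", newline="")
raw_lines = untranslated.readlines()
print(raw_lines)  # ['unix\n', 'old mac\r', 'dos\r\n']
```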

Some languages have created special variables, constants, and subroutines to facilitate newlines during program execution.

Common problems

The different newline conventions often cause text files that have been transferred between systems of different types to be displayed incorrectly. For example, files originating on Unix or Apple Macintosh systems may appear as a single long line on some Windows programs. Conversely, when viewing a file originating from a Windows computer on a Unix system, the extra CR may be displayed as ^M at the end of each line or as a second line break.

The problem can be hard to spot if some programs handle the foreign newlines properly while others do not. For example, a compiler may fail with obscure syntax errors even though the source file looks correct when displayed on the console or in an editor. On a Unix system, the command cat -v myfile.txt will send the file to stdout (normally the terminal) and make the ^M visible, which can be useful for debugging. Modern text editors generally recognize all flavours of CR / LF newlines and allow the user to convert between the different standards. Web browsers are usually also capable of displaying text files and websites which use different types of newlines.

The File Transfer Protocol can automatically convert newlines in files being transferred between systems with different newline representations when the transfer is done in "ASCII mode". However, transferring binary files in this mode usually has disastrous results: Any occurrence of the newline byte sequence—which does not have line terminator semantics in this context, but is just part of a normal sequence of bytes—will be translated to whatever newline representation the other system uses, effectively corrupting the file. FTP clients often employ some heuristics (for example, inspection of filename extensions) to automatically select either binary or ASCII mode, but in the end it is up to the user to make sure his or her files are transferred in the correct mode. If there is any doubt as to the correct mode, binary mode should be used, as then no files will be altered by FTP, though they may display incorrectly.
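
The corruption can be reproduced with a few bytes. This Python sketch only mimics the newline translation of an ASCII-mode transfer from an LF system to a CR+LF system; it does not use FTP, and the payload is invented for the example.

```python
def ascii_mode_to_crlf(data: bytes) -> bytes:
    """Mimic an ASCII-mode transfer from an LF system to a CR+LF system."""
    return data.replace(b"\n", b"\r\n")


# Made-up binary payload in which the byte 0x0A is ordinary data, not a line end.
payload = bytes([0x00, 0x0A, 0xFF, 0x0A, 0x10])
damaged = ascii_mode_to_crlf(payload)

print(len(payload), len(damaged))  # 5 7
```

Every 0x0A byte gains a preceding 0x0D, so offsets and lengths inside the binary file no longer match what its format expects.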

Conversion utilities

Text editors are often used for converting a text file between different newline formats; most modern editors can read and write files using at least the different ASCII CR/LF conventions. The standard Windows editor Notepad is not one of them (although Wordpad and the MS-DOS Editor are).

Editors are often unsuitable for converting larger files. For larger files (on Windows NT/2000/XP) the following command is often used:

TYPE unix_file | FIND "" /V > dos_file

On many Unix systems, the dos2unix (sometimes named fromdos or d2u) and unix2dos (sometimes named todos or u2d) utilities are used to translate between ASCII CR+LF (DOS/Windows) and LF (Unix) newlines. Different versions of these commands vary slightly in their syntax. However, the tr command is available on virtually every Unix-like system and is used to perform arbitrary replacement operations on single characters. A DOS/Windows text file can be converted to Unix format by simply removing all ASCII CR characters with

tr -d '\r' < inputfile > outputfile

or, if the text has only CR newlines, by converting all CR newlines to LF with

tr '\r' '\n' < inputfile > outputfile

The same tasks are sometimes performed with awk, sed, tr, or in Perl if the platform has a Perl interpreter:

awk '{sub("$","\r\n"); printf("%s",$0);}' inputfile > outputfile  # UNIX to DOS  (adding CRs on Linux and BSD-based OSes that lack GNU extensions)
awk '{gsub("\r",""); print;}' inputfile > outputfile              # DOS to UNIX  (removing CRs on Linux and BSD-based OSes that lack GNU extensions)
sed -e 's/$/\r/' inputfile > outputfile              # UNIX to DOS  (adding CRs on Linux-based OSes that use GNU extensions)
sed -e 's/\r$//' inputfile > outputfile              # DOS  to UNIX (removing CRs on Linux-based OSes that use GNU extensions)
cat inputfile | tr -d "\r" > outputfile              # DOS  to UNIX (removing CRs using tr(1). Not Unicode compliant.)
perl -pe 's/\r?\n|\r/\r\n/g' inputfile > outputfile  # Convert to DOS
perl -pe 's/\r?\n|\r/\n/g'   inputfile > outputfile  # Convert to UNIX
perl -pe 's/\r?\n|\r/\r/g'   inputfile > outputfile  # Convert to old Mac
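
For comparison, the three Perl one-liners can be mirrored by a single Python function. This is a sketch that works on bytes already read into memory; the function name is an assumption for the example.

```python
import re

ANY_NEWLINE = re.compile(rb"\r\n|\r|\n")  # CR+LF first, then a lone CR or LF


def convert_newlines(data: bytes, target: bytes) -> bytes:
    """Replace every CR+LF, CR, or LF line ending in `data` with `target`."""
    # A function is used as the replacement so `target` is inserted literally.
    return ANY_NEWLINE.sub(lambda _match: target, data)


mixed = b"a\nb\r\nc\rd"
print(convert_newlines(mixed, b"\r\n"))  # b'a\r\nb\r\nc\r\nd'  (to DOS)
print(convert_newlines(mixed, b"\n"))    # b'a\nb\nc\nd'        (to UNIX)
print(convert_newlines(mixed, b"\r"))    # b'a\rb\rc\rd'        (to old Mac)
```

Matching CR+LF before the single characters is what keeps a DOS line ending from being converted twice.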

To identify what type of line breaks a text file contains, the file command can be used. Moreover, the editor Vim can be a convenient way to make a file compatible with the Windows Notepad text editor. For example:

[prompt] > file myfile.txt
myfile.txt: ASCII English text
[prompt] > vim myfile.txt
  within vim :set fileformat=dos
             :wq
[prompt] > file myfile.txt
myfile.txt: ASCII English text, with CRLF line terminators

The following grep commands echo the filename (in this case myfile.txt) to the command line if the file is of the specified style:

grep -PL $'\r\n' myfile.txt # show UNIX style file (LF terminated)
grep -Pl $'\r\n' myfile.txt # show DOS style file (CRLF terminated)

For Debian-based systems, these commands are used:

egrep -L $'\r\n' myfile.txt # show UNIX style file (LF terminated)
egrep -l $'\r\n' myfile.txt # show DOS style file (CRLF terminated)

The above grep commands work under Unix systems or in Cygwin under Windows. Note that these commands make some assumptions about the kinds of files that exist on the system (specifically it's assuming only UNIX and DOS-style files—no Mac OS 9-style files).
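
Where the two-style assumption does not hold, the endings can be counted explicitly. The following Python sketch (with a hypothetical function name) also reports lone CR endings, which the grep commands above do not distinguish.

```python
import re


def newline_styles(data: bytes) -> dict:
    """Count CR+LF, lone LF, and lone CR line endings in `data`."""
    counts = {"CRLF": 0, "LF": 0, "CR": 0}
    for match in re.finditer(rb"\r\n|\r|\n", data):
        ending = match.group()
        if ending == b"\r\n":
            counts["CRLF"] += 1
        elif ending == b"\n":
            counts["LF"] += 1
        else:
            counts["CR"] += 1
    return counts


print(newline_styles(b"a\r\nb\nc\rd\r\n"))  # {'CRLF': 2, 'LF': 1, 'CR': 1}
```

A file with more than one non-zero count has mixed line endings.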

This technique is often combined with find to list files recursively. For instance, the following command checks all "regular files" (e.g. it will exclude directories, symbolic links, etc.) to find all UNIX-style files in a directory tree, starting from the current directory (.), and saves the results in file unix_files.txt, overwriting it if the file already exists:

find . -type f -exec grep -PL '\r\n' {} \; > unix_files.txt
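
A rough Python counterpart of this search is sketched below. It is not a drop-in replacement for the command: the function name is hypothetical, and each file is read fully into memory, so it suits small trees only.

```python
from pathlib import Path


def unix_style_files(root: str) -> list:
    """List regular files under `root` that contain no CR+LF sequence."""
    found = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and symbolic links; keep regular files only.
        if path.is_file() and not path.is_symlink():
            if b"\r\n" not in path.read_bytes():
                found.append(str(path))
    return found
```

As with grep -L, a file with no line endings at all is also reported, because it contains no CR+LF.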

This example will find C files and convert them to LF style line endings:

find -name '*.[ch]' -exec fromdos {} \;

The file command also detects the type of EOL used:

file myfile.txt
> myfile.txt: ASCII text, with CRLF line terminators

Other tools permit the user to visualise the EOL characters:

od -a myfile.txt
cat -e myfile.txt
hexdump -c myfile.txt

dos2unix, unix2dos, mac2unix, unix2mac, mac2dos, dos2mac can perform conversions. The flip[10] command is often used.

References

  1. ^ The origin of the older computer term "CRLF" - which redirects to this Newline article - or "Carriage Return [and] Line Feed", derives from standard manual typewriter design, whereby at the end of a line of text the typist pushes a lever at the left end of the carriage to return it to position for beginning the next line. In so doing, a mechanism also rolls the typewriter's platen by one line, advancing ("feeding") the paper to the correct position.
  2. ^ cr.yp.to
  3. ^ UTR #13: Unicode Newline Guidelines
  4. ^ YAML Ain't Markup Language (YAML™) Version 1.2
  5. ^ "ECMAScript Language Specification 5th edition". ECMA International. December 2009. p. 15. http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf. Retrieved 4 April 2010. 
  6. ^ binmode - perldoc.perl.org
  7. ^ PHP: Strings - Manual
  8. ^ Lexical analysis – Python v3.0.1 documentation
  9. ^ What's new in Python 2.3
  10. ^ ASCII text conversion between UNIX, Macintosh, MS-DOS

This page contains text from Wikipedia, the Free Encyclopedia - http://en.wikipedia.org/wiki/Newline

This article is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License, which means that you can copy and modify it as long as the entire work (including additions) remains under this license.