
Brackets are tall punctuation marks used in matched pairs within text, to set apart or interject other text. Used unqualified, brackets refer to different types of brackets in different parts of the world and in different contexts.

List of types

  • ( ) — round brackets, open brackets, close brackets (UK), or parentheses
  • [ ] — square brackets, closed brackets, or brackets (US)
  • { } — curly brackets, definite brackets, swirly brackets, curly braces, birdie brackets, Scottish brackets, squirrelly brackets, braces, gullwings, seagull, fancy brackets, or DeLorean Brackets[citation needed]
  • ⟨ ⟩ — pointy brackets, angle brackets, triangular brackets, diamond brackets, tuples, or chevrons
  • < > — inequality signs, pointy brackets, or brackets. Sometimes referred to as angle brackets, in such cases as HTML markup. Occasionally known as broken brackets or brokets.[1]
  • ‹ ›; « » — angular quote brackets, or guillemets
  • ⸤ ⸥; 「 」 — corner brackets

History

The chevron was the earliest type to appear in written English. Desiderius Erasmus coined the term lunula to refer to the rounded parentheses (), recalling the round shape of the moon.[2]

Usage

In addition to referring to the class of all types of brackets, the unqualified word bracket is most commonly used to refer to a specific type of bracket. In modern American usage this is usually the square bracket.

In American usage, parentheses are usually considered separate from other brackets, and calling them "brackets" at all is unusual even though they serve a similar function. In more formal usage "parenthesis" may refer to the entire bracketed text, not just to the punctuation marks used (so all the text in this set of round brackets may be said to be a parenthesis or a parenthetical).[3]

According to early typographic practice, brackets are never set in italics, even when the surrounding characters are italic.[4]

Types

Parentheses ( )

Parentheses (/pəˈrɛnθɨsiːz/; singular parenthesis, /pəˈrɛnθɨsɨs/) – also called simply brackets, round brackets, curved brackets, oval brackets, or, colloquially, parens – contain material that could be omitted without destroying or altering the meaning of a sentence. In most writing, heavy use of parentheses usually signals badly structured text. A milder effect may be obtained by using a pair of commas as the delimiter, though visual confusion may result if the sentence contains commas for other purposes.

Parentheses may be used in formal writing to add supplementary information, such as "Sen. John McCain (R., Arizona) spoke at length." They can also indicate shorthand for "either singular or plural" for nouns – e.g., "the claim(s)" – or for "either masculine or feminine" in some languages with grammatical gender.[5]

Parenthetical phrases have been used extensively in informal writing and stream of consciousness literature. Of particular note is the southern American author William Faulkner (see Absalom, Absalom! and the Quentin section of The Sound and the Fury) as well as poet E. E. Cummings. Parentheses have historically been used where the dash is currently used – that is, in order to depict alternatives, such as "parenthesis)(parentheses". Examples of this usage can be seen in editions of Fowler's.

Parentheses may also be nested (generally with one set (such as this) inside another set). This is not commonly used in formal writing (though sometimes other brackets [especially square brackets] will be used for one or more inner set of parentheses [in other words, secondary {or even tertiary} phrases can be found within the main parenthetical sentence]).[6]

Any punctuation inside parentheses or other brackets is independent of the rest of the text: "Mrs. Pennyfarthing (What? Yes, that was her name!) was my landlady." In this usage the explanatory text in the parentheses is a parenthesis. (Parenthesized text is usually short and within a single sentence. Where several sentences of supplemental material are used in parentheses the final full stop would be within the parentheses. Again, the parenthesis implies that the meaning and flow of the text is supplemental to the rest of the text and the whole would be unchanged were the parenthesized sentences removed.)

Parentheses in mathematics signify a different precedence of operators. Normally, 2 + 3 × 4 would be 14, since the multiplication is done before the addition. On the other hand (2 + 3) × 4 is 20, because the parentheses override normal precedence, causing the addition to be done first. Some authors follow the convention in mathematical equations that, when parentheses have one level of nesting, the inner pair are parentheses and the outer pair are square brackets. Example:

$$[(2+3)\times 4]^2 = 400$$


A related convention is that when parentheses have two levels of nesting, braces are the outermost pair.
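The precedence rules described above can be checked directly. The following is a minimal sketch in Python; the variable names are illustrative only, and because Python has no square-bracket grouping, nested parentheses stand in for the mathematical convention of an outer pair of square brackets.

```python
# Multiplication binds more tightly than addition.
without_parens = 2 + 3 * 4      # evaluated as 2 + (3 * 4)
with_parens = (2 + 3) * 4       # parentheses force the addition first

# Nested parentheses play the role of [(2 + 3) x 4] squared.
nested = ((2 + 3) * 4) ** 2

print(without_parens, with_parens, nested)  # prints: 14 20 400
```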

Parentheses are also used to set apart the arguments in mathematical functions. For example, f(x) is the function f applied to the variable x. In coordinate systems parentheses are used to denote a set of coordinates; so in the Cartesian coordinate system (4, 7) may represent the point located at 4 on the x-axis and 7 on the y-axis. Parentheses may also represent intervals; (0,5), for example, is the interval between 0 and 5, not including 0 or 5.

Parentheses may also be used to represent a binomial coefficient, and in chemistry to denote a polyatomic ion.

In Chinese and Japanese, 【 】, a combination of brackets and parentheses called 方頭括號 in Chinese and sumitsuki in Japanese, is used for inference in Chinese and in titles and headings in Japanese.

Square brackets [ ]

Square brackets – also called simply brackets (US) – are mainly used to enclose explanatory or missing material usually added by someone other than the original author, especially in quoted text.[7] Examples include: "I appreciate it [the honor], but I must refuse", and "the future of psionics [see definition] is in doubt". They may also be used to modify quotations. For example, if referring to someone's statement "I hate to do laundry", one could write: He "hate[s] to do laundry".

The bracketed expression "[sic]" is used after a quote or reprinted text to indicate the passage appears exactly as in the original source; a bracketed ellipsis [...] is often used to indicate deleted material; bracketed comments indicate when original text has been modified for clarity: "I'd like to thank [several unimportant people] and my parentals [sic] for their love, tolerance [...] and assistance [emphasis added]".[8]

Brackets are used in mathematics in a variety of notations, including standard notations for intervals, commutators, the floor function, the Lie bracket, the Iverson bracket, and matrices.

In translated works, brackets are used to signify the same word or phrase in the original language to avoid ambiguity.[9] For example: He is trained in the way of the open hand [karate].

When nested parentheses are needed, brackets are used as a substitute for the inner pair of parentheses within the outer pair.[10] When deeper levels of nesting are needed, convention is to alternate between parentheses and brackets at each level.

In linguistics, phonetic transcriptions are generally enclosed within brackets,[11] often using the International Phonetic Alphabet, while phonemic transcriptions typically use paired slashes.

Brackets can also be used in chemistry to represent the concentration of a chemical substance or to denote distributed charge in a complex ion.

Brackets (called move-left symbols or move-right symbols) are added to the sides of text in proofreading to indicate changes in indentation:

Move left: [To Fate I sue, of other means bereft, the only refuge for the wretched left.
Center: ]Paradise Lost[
Move up: [image: quote to be moved up]

Brackets are used to denote parts of the text that need to be checked when preparing drafts prior to finalizing a document. They often denote points that have not yet been agreed to in legal drafts and the year in which a report was made for certain case law decisions.

From the top: brackets, braces, parentheses, angle brackets/chevrons, and inequality signs

Curly brackets { }

Curly brackets – also called braces (US) or squiggly brackets (UK, informally[citation needed]) – are sometimes used in prose to indicate a series of equal choices:[citation needed] "Select your animal {goat, sheep, cow, horse} and follow me". They are used in specialized ways in poetry and music (to mark repeats or joined lines). In music, the mark joining staves is called an accolade or "brace"; it connects two or more lines of music that are played simultaneously.[12] In mathematics they delimit sets. In many programming languages, they enclose groups of statements. Such languages (C being one of the best-known examples) are therefore called curly bracket languages. Some people use a brace to signify movement in a particular direction.[citation needed]

Presumably due to the similarity of the words brace and bracket (although they do not share an etymology), many people mistakenly treat brace as a synonym for bracket. Therefore, when it is necessary to avoid any possibility of confusion, such as in computer programming, it may be best to use the term curly bracket rather than brace. However, general usage in North American English favours the latter form.[citation needed] Indian programmers often use the name "flower bracket".[13]

In classical mechanics, curly brackets are often also used to denote the Poisson bracket between two quantities. It is defined as follows:

$$\{f,g\} = \sum_{i=1}^{N} \left[ \frac{\partial f}{\partial q_{i}} \frac{\partial g}{\partial p_{i}} - \frac{\partial f}{\partial p_{i}} \frac{\partial g}{\partial q_{i}} \right]$$


Angle brackets or chevrons ⟨ ⟩

Chevrons ⟨ ⟩[14] are often used to enclose highlighted material. Some dictionaries use chevrons to enclose short excerpts illustrating the usage of words.

In physical sciences, chevrons are used to denote an average over time or over another continuous parameter. For example,

$$\left\langle V(t)^2 \right\rangle = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} V(t)^2\,\mathrm{d}t.$$


The inner product of two vectors is commonly written as $\langle a, b\rangle$, but the notation (a, b) is also used.

In mathematical physics, especially quantum mechanics, it is common to write the inner product between elements as $\langle a | b\rangle$, as a short version of $\langle a |\cdot| b\rangle$, or $\langle a | \hat{O} | b\rangle$, where $\hat{O}$ is an operator. This is known as Dirac notation or bra-ket notation.

In set theory, chevrons or parentheses are used to denote ordered pairs and other tuples, whereas curly brackets are used for unordered sets.
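The difference between ordered tuples and unordered sets has a direct analogue in Python, which happens to use parentheses for tuples and curly brackets for sets. This is a minimal sketch with illustrative variable names, not a statement about set-theoretic notation itself.

```python
# Tuples are ordered: swapping the elements gives a different tuple.
ordered_a = (1, 2)
ordered_b = (2, 1)
tuples_equal = ordered_a == ordered_b

# Sets are unordered: the same elements in any order give the same set.
unordered_a = {1, 2}
unordered_b = {2, 1}
sets_equal = unordered_a == unordered_b

print(tuples_equal, sets_equal)  # prints: False True
```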

In linguistics, chevrons indicate orthography, as in "The English word /kæt/ is spelled ⟨cat⟩." In epigraphy, they may be used for mechanical transliterations of a text into the Latin script.

In textual criticism, and hence in many editions of pre-modern works, chevrons denote sections of the text which are illegible or otherwise lost; the editor will often insert his own reconstruction where possible within them.

Chevrons are infrequently used to denote dialogue that is thought instead of spoken, such as:

⟨ What a beautiful flower! ⟩

The mathematical or logical symbols for greater-than (>) and less-than (<) are inequality symbols, and are not punctuation marks when so used. However, true chevrons are not available on a typical computer keyboard, whereas the less-than and greater-than symbols are, so the latter are often substituted. In this case they are loosely referred to as angle brackets or chevrons.

Single and double pairs of comparison operators (<<, >>) (meaning much smaller than and much greater than) are sometimes used instead of guillemets («, ») (used as quotation marks in many languages) when the proper glyphs are not available.

In comic books, chevrons are often used to mark dialogue that has been translated notionally from another language; in other words, if a character is speaking another language, instead of writing in the other language and providing a translation, one writes the translated text within chevrons. Of course, since no foreign language is actually written, this is only notionally translated.[citation needed]

Chevron-like symbols are part of standard Chinese and Korean punctuation, where they generally enclose the titles of books: ︿ and ﹀ or ︽ and ︾ for traditional vertical printing, and 〈 and 〉 or 《 and 》 for horizontal printing. See also non-English usage of quotation marks.

Corner and half brackets 「」, ⌊ ⌋

In East Asian punctuation, corner brackets are used as quotation marks. Half brackets are used in English to mark added text, such as in translations: "Bill saw ⌊her⌋".

The corner brackets ⌈ and ⌉ have at least two uses in mathematical logic: first, as a generalization of quotation marks, and second, to denote the Gödel number of the enclosed expression.

In editions of papyrological texts, half brackets enclose text which is lacking in the papyrus due to damage, but can be restored by virtue of another source, such as an ancient quotation of the text transmitted by the papyrus.[15] For example, Callimachus Iambus 1.2 reads: ἐκ τῶν ὅκου βοῦν κολλύ⌊βου π⌋ιπρήσκουσιν. A hole in the papyrus has obliterated βου π, but these letters are supplied by an ancient commentary on the poem.

Double brackets ⟦ ⟧

In formal semantics, double brackets, ⟦ ⟧, also called Strachey brackets, are used to indicate the semantic evaluation function. The CJK glyphs 〚 〛 look identical except they have added width. They can be typeset in LaTeX with the package stmaryrd.

Computing

Encoding

Representations of various kinds of brackets in ASCII, Unicode and HTML are given below.

Usage Unicode SGML/HTML/XML entities Sample
Quotation
(Western texts)
U+00AB Left double guillemet &#171; « words »
U+00BB Right double guillemet &#187;
U+2039 Left single guillemet &#8249; ‹ x ›
U+203A Right single guillemet &#8250;
General purpose U+0028 Left parenthesis &#40; &lparen; (parenthesis)
U+0029 Right parenthesis &#41; &rparen;
U+005B Left square bracket &#91; [sic]
U+005D Right square bracket &#93;
Technical/mathematical
(common)
U+003C Less-than sign &#60; &lt; <HTML>
U+003E Greater-than sign &#62; &gt;
U+007B Left curly bracket &#123; {round, square, curly}
U+007D Right curly bracket &#125;
Technical/mathematical
(specialized)
U+2308 Left ceiling &#8968; ⌈ceiling⌉
U+2309 Right ceiling &#8969;
U+230A Left floor &#8970; ⌊floor⌋
U+230B Right floor &#8971;
U+27E8 Mathematical left angle bracket &#10216; &lang;* ⟨a, b⟩
U+27E9 Mathematical right angle bracket &#10217; &rang;*
Quotation
(halfwidth East-Asian texts)
U+2329 Left pointing angle bracket &#9001; &lang;* 〈deprecated〉
U+232A Right pointing angle bracket &#9002; &rang;*
U+FF62 Halfwidth left corner bracket &#65378; ｢ｶﾀｶﾅ｣
U+FF63 Halfwidth right corner bracket &#65379;
Quotation
(fullwidth East-Asian texts)
U+3008 Left angle bracket &#12296; 〈한〉
U+3009 Right angle bracket &#12297;
U+300A Left double angle bracket &#12298; 《한한》
U+300B Right double angle bracket &#12299;
U+300C Left corner bracket &#12300; 「白八櫨」
U+300D Right corner bracket &#12301;
U+300E Left white corner bracket &#12302; 『カタカナ』
U+300F Right white corner bracket &#12303;
U+3010 Left thick square bracket &#12304; 【ひらがな】
U+3011 Right thick square bracket &#12305;
General purpose
(fullwidth East-Asian)
U+FF08 Fullwidth left parenthesis &#65288; （Wiki）
U+FF09 Fullwidth right parenthesis &#65289;
U+FF3B Fullwidth left square bracket &#65339; ［sic］
U+FF3D Fullwidth right square bracket &#65341;
Technical/mathematical
(fullwidth East-Asian)
U+FF1C Fullwidth less-than sign &#65308; ＜HTML＞
U+FF1E Fullwidth greater-than sign &#65310;
U+FF5B Fullwidth left curly bracket &#65371; ｛1、2｝
U+FF5D Fullwidth right curly bracket &#65373;

*&lang; and &rang; were tied to the deprecated symbols U+2329 and U+232A in HTML4 and MathML2, but are being migrated to U+27E8 and U+27E9 for HTML5 and MathML3.

Braces (curly brackets) first became part of a character set with the 8-bit code of the IBM 7030 Stretch.[16]

The angle brackets or chevrons at U+27E8 and U+27E9 are for mathematical use and Western languages, while U+3008 and U+3009 are for East Asian languages. The chevrons at U+2329 and U+232A are deprecated in favour of the U+3008 and U+3009 East Asian angle brackets. Unicode discourages their use for mathematics and in Western texts[17] because they are canonically equivalent to the CJK code points U+300x and thus likely to render as double-width symbols. The less-than and greater-than symbols are often used as replacements for chevrons.
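The canonical equivalence mentioned above can be observed with Unicode normalization. The sketch below uses Python's standard unicodedata module; the variable names are illustrative, and the result depends on the Unicode data shipped with the interpreter, so treat it as a demonstration, not a specification.

```python
import unicodedata

deprecated_left = "\u2329"  # LEFT-POINTING ANGLE BRACKET (deprecated)
cjk_left = "\u3008"         # LEFT ANGLE BRACKET (CJK)
math_left = "\u27e8"        # MATHEMATICAL LEFT ANGLE BRACKET

# Normalization replaces the deprecated character with its canonical
# equivalent, the CJK angle bracket.
normalized_deprecated = unicodedata.normalize("NFC", deprecated_left)

# The mathematical angle bracket has no such mapping and is left unchanged.
normalized_math = unicodedata.normalize("NFC", math_left)

print(normalized_deprecated == cjk_left)
print(normalized_math == math_left)
```

Any text pipeline that normalizes its input may therefore silently turn U+2329 into the double-width U+3008, which is one practical reason to prefer U+27E8 and U+27E9 in mathematical and Western text.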

These various bracket characters are frequently used in many computer languages as operators or for other syntax markup. The more common uses follow.

Uses of "(" and ")"

  • are often used to define the syntactic structure of expressions, overriding operator precedence: a*(b+c) has subexpressions a and b+c, whereas a*b+c has subexpressions a*b and c
  • passing parameters or arguments to functions, especially in C and similar languages, and invoking a function or function-like construct: substring($val,10,1)
  • in Lisp, they open and close s-expressions and therefore function applications: (cons a b)
  • in many regular expression syntaxes parentheses create a capturing group, allowing the matched portion inside to be retrieved by the user
  • in Forth, they open and close comments in the code.
  • in Fortran-family and COBOL languages, they are also used for array references
  • in the Perl programming language through Perl 5, they are used to define lists, static array-like structures; this idiom is extended to their use as containers of subroutine (function) arguments
  • in the Perl 6 programming language, they define captures, a structure that defers contextual interpretation. This usage extends to ordinary parentheses as well. They are also used to indicate arguments to function calls and to declare signatures of formal parameters or other variables.
  • in Python they are used to disambiguate tuple literals (immutable ordered lists), which are usually formed by commas, in places where parentheses and commas would otherwise be a part of a function call.
  • in Tcl they are used to enclose the name of an element of an associative array variable
  • in joined brackets in a table form going vertically downwards, a ")" refers to repetition of a term for the number of items towards the left of this joined list of brackets.
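Several of the uses listed above can be seen side by side in one language. The following is a minimal Python sketch; the sample string and variable names are made up for illustration.

```python
import re

# Grouping: parentheses override operator precedence.
a, b, c = 2, 3, 4
grouped = a * (b + c)     # subexpressions a and b + c
ungrouped = a * b + c     # subexpressions a * b and c

# Function call: parentheses enclose the arguments.
length = len("bracket")

# Regular expressions: each parenthesized part is a capturing group.
match = re.search(r"(\d{4})-(\d{2})", "released 2012-02")
year, month = match.groups()

# Tuple literals: the comma makes the tuple, the parentheses disambiguate.
point = (4, 7)
single = (5,)

print(grouped, ungrouped, length, year, month, point, single)
```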

Uses of "[" and "]"

  • to refer to elements of an array or associative array, and sometimes to define the number of elements in an array, especially in C-like languages: queue[3]
  • in many languages, may be used to define a literal anonymous array or list: [5, 10, 15]
  • in most regular expression syntaxes brackets denote a character class: a set of possible characters to choose from
  • in Forth, "[" causes the system to enter interpretation state and "]" causes the system to enter compilation state. For example, within a definition, [ 2 3 + ] literal causes the compiler to switch to the interpreter mode, calculate expression 2+3, leave the result on stack and resume compilation. As a result, a literal constant "5" will be compiled into the definition, instead of the whole expression.
  • in Tcl, they enclose a sub-script to be evaluated and the result substituted
  • in some of Microsoft's .NET (CLI) languages, most notably C# and C++/CLI, they are used to denote metadata attributes.
  • in x86 assembly implementations such as FASM, they are used to distinguish pointers from their data.
  • in Smalltalk, brackets are used to delineate "blocks" or "block closures": groupings of code that can be executed immediately or later by sending a message such as "value" to the block. Blocks are full first-class objects in Smalltalk.
  • in Objective-C, brackets are used to send a message to (i.e. call a method on) an object
  • on Unix, "[" is a shorthand for the test command
  • in JSON they are used to define an array (an ordered sequence of comma-separated values)
  • in programming documentation and metalanguages (e.g. in descriptions of operator or command syntax), optional elements are enclosed in square brackets. For example, "echo [-n] [-e] <text>" means that the -n and -e parameters are optional.
  • delimiting IPv6 addresses in URLs, for example http://[2001:db8:3c4d:15::abcd:ef12]:8080
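A few of the square-bracket uses above, sketched in Python with only the standard library. The variable names and the sample word are illustrative; the URL is the example address from the list.

```python
import json
import re
from urllib.parse import urlsplit

# List literal and element access by index.
queue = [5, 10, 15]
first = queue[0]

# Regular expressions: brackets denote a character class.
vowels = re.findall(r"[aeiou]", "bracket")

# JSON: brackets delimit an array.
numbers = json.loads("[1, 2, 3]")

# URLs: brackets separate an IPv6 address from the port number.
url = urlsplit("http://[2001:db8:3c4d:15::abcd:ef12]:8080")
host, port = url.hostname, url.port

print(first, vowels, numbers, host, port)
```

Without the brackets, the colons inside the IPv6 address could not be told apart from the colon that introduces the port.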

Uses of "{" and "}"

  • are used in some programming languages to define the beginning and ending of blocks of code or data. Languages which use this convention are said to belong to the curly bracket family of programming languages
  • are used to represent certain type definitions or literal data values, such as a composite structure or associative array
  • in mathematics they enclose elements of a set and denote a set
  • in Curl they are used to delimit expressions and statements (similar to Lisp's use of parentheses).
  • in Pascal they define the beginning and ending of comments
  • in most regular expression syntaxes, they are used as quantifiers, matching n repetitions of the previous group
  • in Perl they are also used to refer to elements of an associative array
  • in PHP they are used to determine structures.
  • in Tcl they enclose a string to be substituted without any internal substitutions being performed
  • in Python and Ruby they are used for dictionaries (a mutable set of key: value pairs, separated by commas) and for sets.
  • in TeX/LaTeX they can be used for grouping parts sharing the same local format, wrap parameters, or definitions, depending on the local catcode value
  • in JSON they are used to define an object (an unordered collection of key:value pairs)
  • in metalanguages (e.g. in descriptions of operator or command syntax), possible alternatives are enclosed in braces, if at least one is mandatory.
  • These are also used in music at the start of a stave.
  • In Objective-C they are used to mark the start and/or end of a process.
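Some of the curly-bracket uses above can be illustrated in Python. This is a minimal sketch with made-up data; the names are not taken from any particular program.

```python
import json
import re

# Dictionary literal: key: value pairs separated by commas.
prices = {"goat": 3, "sheep": 5}

# Set literal: duplicates collapse into a single element.
animals = {"goat", "sheep", "goat"}

# JSON: braces delimit an object.
record = json.loads('{"name": "bracket", "pairs": 2}')

# Regular expressions: braces give a repetition count for the preceding item.
exactly_three = re.fullmatch(r"a{3}", "aaa") is not None
only_two = re.fullmatch(r"a{3}", "aa") is not None

print(prices["sheep"], len(animals), record["pairs"], exactly_three, only_two)
```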

Uses of "<" and ">"

These symbols are used in pairs as if they are brackets,

  • to set apart URLs and e-mail addresses in text, such as "I found it on Example.com <http://www.example.com/>" and "This photo is copyrighted by John Smith <johnsmith@example.com>". This is also the computer-readable form for addresses in e-mail headers, specified by RFC 2822.
  • In documentation, often used to specify parameters or other user-specified information (e.g. "The command 'echo <text>' can be used to display <text>")
  • to enclose code tags in SGML, HTML, and XML (e.g. <div>)
  • to select direct children of a parent element in CSS (e.g. ul.main > li, which targets every li element that is a direct child of an element matching ul.main)
  • in C++, C#, and Java they delimit generic arguments
  • in Perl through Perl 5 they are used to read a line from an input source
  • in Perl 6 they combine quoting and associative array lookup
  • in BNF, they're used to denote nonterminals (e.g. <name> ::= <first-name> <last-name>)
  • in ABAP they denote field symbols – placeholders or symbolic names for other fields, which can point to any data object.
  • to indicate an action or status (e.g. <Waves> or <Offline>), particularly in online, real-time text-based discussions (instant messaging, bulletin boards, etc). (Here, asterisks can also be used to signify an action.)
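The angle-bracket address form used in e-mail headers can be taken apart with Python's standard email.utils module. This is a small sketch using the example address from the list above; the variable names are illustrative.

```python
from email.utils import parseaddr

# The display name precedes the brackets; the address sits inside them.
display_name, address = parseaddr("John Smith <johnsmith@example.com>")

print(display_name)  # prints: John Smith
print(address)       # prints: johnsmith@example.com
```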

When not used in pairs to delimit text (not acting as brackets):

Layout styles

In normal writing (prose) an opening bracket is rarely left hanging at the end of a line of text nor is a closing bracket permitted to start one. However, in computer code this is often done intentionally to aid readability. For example, a bracketed list of items separated by semicolons may be written with the brackets on separate lines, and the items, followed by the semicolon, each on one line.

A common error in programming is mismatching braces; accordingly, many IDEs offer brace matching, which highlights matching pairs.
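Brace matching is commonly explained with a stack: each opening bracket is pushed, and each closing bracket must match the most recent unclosed opener. The following Python sketch illustrates that idea under simplifying assumptions; the function name first_mismatch is hypothetical, and the sketch ignores brackets inside string literals and comments, which a real editor would have to handle.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def first_mismatch(text):
    """Return the index of the first unmatched bracket, or None if balanced."""
    stack = []  # (opening character, index) for every opener still unclosed
    for index, char in enumerate(text):
        if char in OPENERS:
            stack.append((char, index))
        elif char in PAIRS:
            # A closer must match the most recent unclosed opener.
            if not stack or stack[-1][0] != PAIRS[char]:
                return index
            stack.pop()
    # Any opener left on the stack was never closed; report the earliest one.
    return stack[0][1] if stack else None


print(first_mismatch("f(a[0]) { }"))  # prints: None
print(first_mismatch("(a]"))          # prints: 2
print(first_mismatch("{ ("))          # prints: 0
```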

Mathematics

In addition to the use of parentheses to specify the order of operations, both parentheses and brackets are used to denote an interval; an interval that includes one endpoint but not the other is also called a half-open range. The notation [a, c) indicates an interval from a to c that includes a but excludes c. That is, [5, 12) is the set of all real numbers between 5 and 12, including 5 but not 12. The numbers may come as close as they like to 12, including 11.999 and so forth (with any finite number of 9s), but 12.0 is not included. In Europe, the notation [5, 12[ is also used for this. The endpoint adjoining the bracket is known as closed, while the endpoint adjoining the parenthesis is known as open. If both brackets are of the same type, the entire interval may be referred to as closed or open as appropriate. Whenever +∞ or −∞ is used as an endpoint, it is normally considered open and adjoined to a parenthesis. See Interval (mathematics) for a more complete treatment.
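Interval membership with open and closed endpoints can be expressed as a small predicate. This is a minimal Python sketch; the function in_interval and its parameter names are hypothetical, chosen so that the defaults correspond to the half-open form [low, high).

```python
def in_interval(x, low, high, closed_low=True, closed_high=False):
    """Test whether x lies in an interval; defaults model [low, high)."""
    above = x >= low if closed_low else x > low
    below = x <= high if closed_high else x < high
    return above and below


print(in_interval(5, 5, 12))                    # [5, 12) contains 5: True
print(in_interval(11.999, 5, 12))               # [5, 12) contains 11.999: True
print(in_interval(12, 5, 12))                   # [5, 12) excludes 12: False
print(in_interval(0, 0, 5, closed_low=False))   # (0, 5) excludes 0: False
```

Python's built-in range follows the same half-open convention for integers: range(5, 12) yields 5 through 11.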

In quantum mechanics, chevrons are also used as part of Dirac's formalism, bra-ket notation, to denote vectors from the dual spaces of the bra ⟨A| and the ket |B⟩. Mathematicians will also commonly write ⟨a, b⟩ for the inner product of two vectors. In statistical mechanics, chevrons denote an ensemble or time average. Chevrons are used in group theory to write group presentations, and to denote the subgroup generated by a collection of elements. Note that obtuse-angled chevrons are not always distinguished from a pair of less-than and greater-than signs <>, which are sometimes used as a typographic approximation of chevrons.

In group theory and ring theory, brackets denote the commutator. In group theory, the commutator [g, h] is commonly defined as g⁻¹h⁻¹gh. In ring theory, the commutator [a, b] is defined as ab − ba. Furthermore, in ring theory, braces denote the anticommutator, where {a, b} is defined as ab + ba. The bracket is also used to denote the Lie derivative, or more generally the Lie bracket in any Lie algebra.
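As a worked illustration of the ring-theoretic definitions [a, b] = ab − ba and {a, b} = ab + ba, consider two 2 × 2 matrices. The matrices A and B below are chosen here purely as an example and do not come from the text above.

```latex
A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
BA = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}

[A, B] = AB - BA = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
\{A, B\} = AB + BA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
```

Because AB and BA differ, the commutator is non-zero, while the anticommutator here happens to be the identity matrix.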

Various notations, such as the vinculum, have a similar effect to brackets in specifying order of operations, or otherwise grouping several characters together for a common purpose.

In the Z formal specification language, braces define a set and chevrons define a sequence.

Accounting

Traditionally in accounting, negative amounts are placed in parentheses.

Law

Brackets are used in some countries in the citation of law reports to identify parallel citations to non-official reporters. For example: Chronicle Pub. Co. v. Superior Court, (1998) 54 Cal.2d 548, [7 Cal.Rptr. 109]. In some other countries (such as England and Wales), square brackets are used to indicate that the year is part of the citation, as opposed to optional information. For example, National Coal Board v England [1954] AC 403, (1954) 98 Sol Jo 176 – the case report is in the 1954 volume of the Appeal Cases reports (year not optional) and in volume 98 of the Solicitor's Journal (year optional, since the volumes are numbered, and so given in round brackets).

When quoted material is in any way altered, the alterations are enclosed in brackets within the quotation. For example: Plaintiff asserts his cause is just, stating, "[m]y causes is [sic] just." While in the original quoted sentence the word "my" was capitalized, it has been modified in the quotation and the change signalled with brackets. Similarly, where the quotation contained a grammatical error, the quoting author signalled that the error was in the original with "[sic]" (Latin for "thus"). (California Style Manual, section 4:59 (4th ed.))

Sports

Tournament brackets, the diagrammatic representation of the series of games played during a tournament usually leading to a single winner, are so named for their resemblance to brackets or braces.

Typing

In role-playing and in writing, brackets are used for out-of-character (OOC) sentences, which are not part of the in-character speech. Example:

(What's your name?)

To avoid ambiguity as to whether this is an in-character parenthetical statement or an out-of-character statement, in many circles double brackets are used, as they are unheard of in standard writing.

((How long have you played here?))

References

  1. ^ http://catb.org/jargon/html/B/broket.html
  2. ^ Truss, Lynne. Eats, Shoots & Leaves, 2003. p. 161. ISBN 1-59240-087-6.
  3. ^ The Free Online Dictionary
  4. ^ Robert Bringhurst, The Elements of Typographic Style, §5.3.2.
  5. ^ Slash (punctuation)#Gender-neutrality in Spanish and Portuguese
  6. ^ Fogarty, Mignon. "Parentheses, Brackets, and Braces". Quick and Dirty Tips. http://grammar.quickanddirtytips.com/parentheses-brackets-and-braces.aspx. Retrieved 27 March 2011. 
  7. ^ The Chicago Manual of Style, 15th ed., The University of Chicago Press, 2003, §6.104
  8. ^ The Columbia Guide to Standard American English
  9. ^ The Chicago Manual of Style, 15th ed., The University of Chicago Press, 2003, §6.105
  10. ^ The Chicago Manual of Style, 15th ed., The University of Chicago Press, 2003, §6.102 and §6.106
  11. ^ The Chicago Manual of Style, 15th ed., The University of Chicago Press, 2003, §6.107
  12. ^ Decodeunicode.org > U+007B LEFT CURLY BRACKET Retrieved on May 3, 2009
  13. ^ K R Venugopal, Rajkumar Buyya, T Ravishankar. Mastering C++, 1999. p. 34. ISBN 0-07-463454-2.
  14. ^ Some fonts don't display these characters correctly. Please refer to the image on the right instead.
  15. ^ M.L. West (1973) Textual Criticism and Editorial Technique (Stuttgart) 81.
  16. ^ Bob, Bemer. "The Great Curly Brace Trace Chase". http://www.bobbemer.com/BRACES.HTM. Retrieved 2009-09-05 
  17. ^ "Miscellaneous Technical", The Unicode Standard, Version 6.1, 2012, http://www.unicode.org/charts/PDF/U2300.pdf, retrieved 2012-02-01 
  18. ^ Bryant, Randal E.; O'Hallaron, David. Computer Systems: A Programmer's Perspective, 2003. p. 794. ISBN 0-13-034074-X.

Bibliography

  • Lennard, John (1991). But I Digress: The Exploitation of Parentheses in English Printed Verse. Oxford: Clarendon Press. ISBN 0-19-811247-5. 
  • Turnbull; et al. (1964). The Graphics of Communication. New York: Holt.  States that what are depicted as brackets above are called braces and braces are called brackets. This was the terminology in US printing prior to computers.

This page contains text from Wikipedia, the Free Encyclopedia - http://en.wikipedia.org/wiki/Bracket

This article is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License, which means that you can copy and modify it as long as the entire work (including additions) remains under this license.


Figure: A text file created with gedit and viewed with a hex editor. Besides the text objects, there are only EOL markers with the hexadecimal value 0A.

In computing, a newline,[1] also known as a line break or end-of-line (EOL) marker, is a special character or sequence of characters signifying the end of a line of text. The name comes from the fact that the next character after the newline will appear on a new line—that is, on the next line below the text immediately preceding the newline. The actual codes representing a newline vary across operating systems, which can be a problem when exchanging text files between systems with different newline representations.

There is also some confusion about whether newlines terminate or separate lines. If a newline is considered a separator, there will be no newline after the last line of a file. The general convention on most systems is to add a newline even after the last line, i.e. to treat newline as a line terminator. Some programs have problems processing the last line of a file if it is not newline terminated. Conversely, programs that expect newline to be used as a separator will interpret a final newline as starting a new (empty) line.
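The two interpretations can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this example; the sketch assumes LF as the newline character.

```python
def split_as_separator(text):
    # Newline separates lines: a trailing newline starts a new, empty line.
    return text.split("\n")


def split_as_terminator(text):
    # Newline terminates lines: a trailing newline simply ends the last line.
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()
    return lines


sample = "first\nsecond\n"
print(split_as_separator(sample))   # ['first', 'second', '']
print(split_as_terminator(sample))  # ['first', 'second']
```

A program written with one interpretation in mind sees an extra empty line, or a missing final line, when given a file produced under the other.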

In text intended primarily to be read by humans using software which implements the word wrap feature, a newline character typically only needs to be stored if a line break is required independent of whether the next word would fit on the same line, such as between paragraphs and in vertical lists. See hard return and soft return.

Representations

Software applications and operating systems usually represent a newline with one or two control characters:

  • Systems based on ASCII or a compatible character set use either LF (Line feed, '\n', 0x0A, 10 in decimal) or CR (Carriage return, '\r', 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, '\r\n', 0x0D 0x0A). These characters are based on printer commands: the line feed instructed the printer to advance the paper by one line, and the carriage return instructed the printer carriage to return to the beginning of the current line. Some rare systems, such as QNX before version 4, used the ASCII RS (record separator, 0x1E, 30 in decimal) character as the newline character.
  • EBCDIC systems—mainly IBM mainframe systems, including z/OS (OS/390) and i5/OS (OS/400)—use NEL (Next Line, 0x15) as the newline character. Note that EBCDIC also has control characters called CR and LF, but the numerical value of LF (0x25) differs from the one used by ASCII (0x0A). Additionally, there are some EBCDIC variants that also use NEL but assign a different numeric code to the character.
  • Operating systems for the CDC 6000 series defined a newline as two or more zero-valued six-bit characters at the end of a 60-bit word. Some configurations also defined a zero-valued character as a colon character, with the result that multiple colons could be interpreted as a newline depending on position.
  • The ZX80 and ZX81, home computers from Sinclair Research Ltd, used a specific non-ASCII character set with code NEWLINE (0x76, 118 decimal) as the newline character.
  • OpenVMS uses a record-based file system, which stores text files as one record per line. In most file formats, no line terminators are actually stored, but the Record Management Services facility can transparently add a terminator to each line when it is retrieved by an application. The records themselves could contain the same line terminator characters, which could either be considered a feature or a nuisance depending on the application.
  • Fixed line length was used by some early mainframe operating systems. In such a system, an implicit end-of-line was assumed every 80 characters, for example. No newline character was stored. If a file was imported from the outside world, lines shorter than the line length had to be padded with spaces, while lines longer than the line length had to be truncated. This mimicked the use of punched cards, on which each line was stored on a separate card, usually with 80 columns on each card. Many of these systems added a carriage control character to the start of the next record; this could indicate whether the next record was a continuation of the line started by the previous record, a new line, or a line that should overprint the previous one (similar to a CR). Often this was a normal printing character such as '#', which therefore could not be used as the first character in a line. Some early line printers interpreted these characters directly in the records sent to them.
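As an illustration of the ASCII-based conventions in the first item above, the following Python sketch counts how often each of CR+LF, LF and CR occurs in a byte string. The function name is hypothetical, and the sketch deliberately ignores the EBCDIC, record-based and fixed-length representations.

```python
import re


def count_newline_styles(data: bytes) -> dict:
    # CR+LF is listed first so that it is matched as a unit,
    # not as a separate CR followed by a separate LF.
    counts = {"CRLF": 0, "LF": 0, "CR": 0}
    for match in re.finditer(rb"\r\n|\n|\r", data):
        token = match.group()
        if token == b"\r\n":
            counts["CRLF"] += 1
        elif token == b"\n":
            counts["LF"] += 1
        else:
            counts["CR"] += 1
    return counts


print(count_newline_styles(b"a\r\nb\nc\rd\r\n"))
# {'CRLF': 2, 'LF': 1, 'CR': 1}
```

A file in which more than one counter is non-zero has mixed line endings, which is often the first thing worth checking when a file displays oddly.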

Most textual Internet protocols (including HTTP, SMTP, FTP, IRC and many others) mandate the use of ASCII CR+LF (0x0D 0x0A) on the protocol level, but recommend that tolerant applications recognize lone LF as well. In practice, there are many applications that erroneously use the C newline character '\n' instead (see section Newline in programming languages below). This leads to problems when trying to communicate with systems adhering to a stricter interpretation of the standards; one such system is the qmail MTA that actively refuses to accept messages from systems that send bare LF instead of the required CR+LF.[2]
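A rough Python sketch of both sides of this rule follows: a sender that spells out CR+LF explicitly instead of relying on '\n', and a tolerant receiver that also accepts a lone LF. The function names and the host name are assumptions made for the example, and the request is a bare-bones illustration rather than a complete HTTP client.

```python
CRLF = b"\r\n"


def build_request(host: str, path: str = "/") -> bytes:
    # Every protocol line ends in an explicit CR+LF; the two empty
    # entries produce the blank line that ends the header block.
    lines = [
        f"GET {path} HTTP/1.1".encode("ascii"),
        f"Host: {host}".encode("ascii"),
        b"",
        b"",
    ]
    return CRLF.join(lines)


def split_protocol_lines(data: bytes) -> list:
    # Tolerant receiver: split on LF and drop a preceding CR if present,
    # so both CR+LF and bare LF are accepted as line endings.
    return [line[:-1] if line.endswith(b"\r") else line
            for line in data.split(b"\n")]


print(build_request("example.org"))
# b'GET / HTTP/1.1\r\nHost: example.org\r\n\r\n'
print(split_protocol_lines(b"a\r\nb\nc"))
# [b'a', b'b', b'c']
```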

FTP has a feature to transform newlines between CR+LF and LF only when transferring text files. This must not be used on binary files. Usually binary files and text files are recognised by checking their filename extension.

Unicode

The Unicode standard defines a large number of characters that conforming applications should recognize as line terminators:[3]

 LF:    Line Feed, U+000A
 VT:    Vertical Tab, U+000B
 FF:    Form Feed, U+000C
 CR:    Carriage Return, U+000D
 CR+LF: CR (U+000D) followed by LF (U+000A)
 NEL:   Next Line, U+0085
 LS:    Line Separator, U+2028
 PS:    Paragraph Separator, U+2029

This may seem overly complicated compared to an approach such as converting all line terminators to a single character, for example LF. However, Unicode was designed to preserve all information when converting a text file from any existing encoding to Unicode and back. Therefore, Unicode should contain characters included in existing encodings. NEL is included in ISO-8859-1[citation needed] and EBCDIC (0x15). The approach taken in the Unicode standard allows round-trip transformation to be information-preserving while still enabling applications to recognize all possible types of line terminators.
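A minimal Python sketch of an application that recognizes all of the terminators listed above is shown here. The names are hypothetical. Note that the normalizing function is lossy: it is the single-character approach mentioned above, and it cannot be reversed.

```python
import re

# The terminators from the list above. CR+LF comes first so that the
# pair is matched as one terminator rather than two.
UNICODE_NEWLINES = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")


def split_unicode_lines(text):
    # Split on any recognized terminator, keeping only the line contents.
    return UNICODE_NEWLINES.split(text)


def normalize_to_lf(text):
    # Replace every recognized terminator with a single LF (lossy).
    return UNICODE_NEWLINES.sub("\n", text)


print(split_unicode_lines("a\r\nb\x85c\u2028d\ne"))
# ['a', 'b', 'c', 'd', 'e']
```

Python's built-in str.splitlines() recognizes the same set of terminators, plus a few additional separator characters.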

Newline codes greater than 0x7F are seldom recognized or used. They take multiple bytes in UTF-8, and the code for NEL has been used as the ellipsis ('…') character in Windows-1252. For instance:

  • YAML[4] no longer recognizes them as special in order to be compatible with JSON.
  • ECMAScript[5] accepts LS and PS as line breaks, but considers U+0085 (NEL) white space, not a line break.
  • Microsoft Windows 2000 does not treat any of NEL, LS or PS as a line break in the default text editor Notepad.
  • On Linux, the popular editor gedit treats LS and PS as newlines but does not treat NEL as one.
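Both difficulties mentioned above (the multi-byte UTF-8 encodings and the clash with Windows-1252) can be checked with a few lines of Python. The variable names are illustrative only.

```python
NEL, LS, PS = "\u0085", "\u2028", "\u2029"

# Each of these code points needs more than one byte in UTF-8.
utf8_lengths = {
    name: len(ch.encode("utf-8"))
    for name, ch in (("NEL", NEL), ("LS", LS), ("PS", PS))
}
print(utf8_lengths)  # {'NEL': 2, 'LS': 3, 'PS': 3}

# The same byte value decodes differently depending on the assumed encoding.
as_latin1 = b"\x85".decode("latin-1")  # U+0085, NEL
as_cp1252 = b"\x85".decode("cp1252")   # U+2026, horizontal ellipsis
print(as_latin1 == NEL, as_cp1252 == "\u2026")  # True True
```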

History

ASCII was developed simultaneously by the ISO and the ASA, the predecessor organization to ANSI. During the period of 1963–1968, the ISO draft standards supported the use of either CR+LF or LF alone as a newline, while the ASA drafts supported only CR+LF.

The sequence CR+LF was in common use on many early computer systems that had adopted Teletype machines, typically a Teletype Model 33 ASR, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the Teletype machine and follow its conventions.

Most minicomputer systems from DEC used this convention. CP/M used it as well, to print on the same terminals that minicomputers used. From there MS-DOS (1981) adopted CP/M's CR+LF in order to be compatible, and this convention was inherited by Microsoft's later Windows operating system.

The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even many early video displays required multiple character times to scroll the display.

The Multics operating system began development in 1964 and used LF alone as its newline. Multics used a device driver to translate this character to whatever sequence a printer needed (including extra padding characters), and the single byte was much more convenient for programming. The seemingly more obvious choice of CR was not used, as a plain CR provided the useful function of overprinting one line with another, and thus it was useful to not translate it. Unix followed the Multics practice, and later systems followed Unix.

In programming languages

To facilitate the creation of portable programs, programming languages provide some abstractions to deal with the different types of newline sequences used in different environments.

The C programming language provides the escape sequences '\n' (newline) and '\r' (carriage return). However, these are not required to be equivalent to the ASCII LF and CR control characters. The C standard only guarantees two things:

  1. Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.
  2. When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, no translation is performed, and the internal representation produced by '\n' is output directly.

On Unix platforms, where C originated, the native newline sequence is ASCII LF (0x0A), so '\n' was simply defined to be that value. With the internal and external representation being identical, the translation performed in text mode is a no-op, and text mode and binary mode behave the same. This has caused many programmers who developed their software on Unix systems simply to ignore the distinction completely, resulting in code that is not portable to different platforms.
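The difference between the internal and external representation can be observed directly. The sketch below uses Python rather than C: Python's built-in open() performs a comparable translation in text mode, controlled by its newline parameter. The helper name is hypothetical, and a temporary file is used so that the example leaves nothing behind.

```python
import os
import tempfile


def written_bytes(text, newline):
    # Write text in text mode with the given newline setting, then
    # read the file back in binary mode to see what was stored.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


print(written_bytes("a\nb\n", "\r\n"))  # b'a\r\nb\r\n'
print(written_bytes("a\nb\n", "\n"))    # b'a\nb\n'
```

With newline="\n" the translation is a no-op, which mirrors the situation on Unix described above; with newline="\r\n" each internal '\n' becomes two bytes on disk.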

The C library function fgets() is best avoided in binary mode because any file not written with the UNIX newline convention will be misread. Also, in text mode, any file not written with the system's native newline sequence (such as a file created on a UNIX system, then copied to a Windows system) will be misread as well.

Another common problem is the use of '\n' when communicating using an Internet protocol that mandates the use of ASCII CR+LF for ending lines. Writing '\n' to a text mode stream works correctly on Windows systems, but produces only LF on Unix, and something completely different on more exotic systems. Writing "\r\n" to a binary mode stream is slightly better, because no translation is applied, although it still assumes that '\r' and '\n' map to ASCII CR and LF.

Many languages, such as C++, Perl,[6] and Haskell provide the same interpretation of '\n' as C.

Java, PHP,[7] and Python[8] provide the '\r\n' sequence (for ASCII CR+LF). In contrast to C, these are guaranteed to represent the values U+000D and U+000A, respectively.

The Java I/O libraries do not transparently translate these into platform-dependent newline sequences on input or output. Instead, they provide functions for writing a full line that automatically add the native newline sequence, and functions for reading lines that accept any of CR, LF, or CR+LF as a line terminator (see BufferedReader.readLine()). The System.getProperty() method can be used to retrieve the underlying line separator.

Example:

  String eol = System.getProperty( "line.separator" );
  String lineColor = "Color: Red" + eol;

Python permits "Universal Newline Support" when opening a file for reading, when importing modules, and when executing a file.[9]
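As a rough illustration of universal newline handling, the sketch below uses the newline=None setting of Python 3's io.TextIOWrapper (the same setting that open() uses by default for text files). The function name is hypothetical, and this shows the Python 3 mechanism rather than the original Python 2.3 interface.

```python
import io


def read_universal(raw: bytes) -> list:
    # newline=None enables universal newlines on input:
    # CR, LF and CR+LF are all translated to "\n" while reading.
    reader = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
    return reader.read().split("\n")


print(read_universal(b"unix\nwindows\r\nold mac\rlast"))
# ['unix', 'windows', 'old mac', 'last']
```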

Some languages have created special variables, constants, and subroutines to facilitate newlines during program execution.

Common problems

The different newline conventions often cause text files that have been transferred between systems of different types to be displayed incorrectly. For example, files originating on Unix or Apple Macintosh systems may appear as a single long line on some Windows programs. Conversely, when viewing a file originating from a Windows computer on a Unix system, the extra CR may be displayed as ^M at the end of each line or as a second line break.

The problem can be hard to spot if some programs handle the foreign newlines properly while others do not. For example, a compiler may fail with obscure syntax errors even though the source file looks correct when displayed on the console or in an editor. On a Unix system, the command cat -v myfile.txt will send the file to stdout (normally the terminal) and make the ^M visible, which can be useful for debugging. Modern text editors generally recognize all flavours of CR / LF newlines and allow the user to convert between the different standards. Web browsers are usually also capable of displaying text files and websites which use different types of newlines.

The File Transfer Protocol can automatically convert newlines in files being transferred between systems with different newline representations when the transfer is done in "ASCII mode". However, transferring binary files in this mode usually has disastrous results: Any occurrence of the newline byte sequence—which does not have line terminator semantics in this context, but is just part of a normal sequence of bytes—will be translated to whatever newline representation the other system uses, effectively corrupting the file. FTP clients often employ some heuristics (for example, inspection of filename extensions) to automatically select either binary or ASCII mode, but in the end it is up to the user to make sure his or her files are transferred in the correct mode. If there is any doubt as to the correct mode, binary mode should be used, as then no files will be altered by FTP, though they may display incorrectly.
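The corruption can be demonstrated with a simplified model of an ASCII-mode transfer. The two functions below are hypothetical stand-ins for the newline conversion only, not an implementation of FTP. The test data is the eight-byte PNG file signature, which contains both a CR+LF pair and a lone LF.

```python
def ascii_mode_to_unix(data: bytes) -> bytes:
    # Simplified model: every CR+LF is rewritten as LF.
    return data.replace(b"\r\n", b"\n")


def ascii_mode_to_windows(data: bytes) -> bytes:
    # Simplified model: every LF is rewritten as CR+LF.
    return data.replace(b"\n", b"\r\n")


png_signature = b"\x89PNG\r\n\x1a\n"

damaged = ascii_mode_to_unix(png_signature)
print(damaged)  # b'\x89PNG\n\x1a\n'

# Converting back does not restore the original bytes, because the
# lone LF at the end is now expanded as well.
restored = ascii_mode_to_windows(damaged)
print(restored == png_signature)  # False
```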

Conversion utilities

Text editors are often used for converting a text file between different newline formats; most modern editors can read and write files using at least the different ASCII CR/LF conventions. The standard Windows editor Notepad is not one of them (although Wordpad and the MS-DOS Editor are).

Editors are often unsuitable for converting larger files. For larger files (on Windows NT/2000/XP) the following command is often used:

TYPE unix_file | FIND "" /V > dos_file

On many Unix systems, the dos2unix (sometimes named fromdos or d2u) and unix2dos (sometimes named todos or u2d) utilities are used to translate between ASCII CR+LF (DOS/Windows) and LF (Unix) newlines. Different versions of these commands vary slightly in their syntax. However, the tr command is available on virtually every Unix-like system and is used to perform arbitrary replacement operations on single characters. A DOS/Windows text file can be converted to Unix format by simply removing all ASCII CR characters with

tr -d '\r' < inputfile > outputfile

or, if the text has only CR newlines, by converting all CR newlines to LF with

tr '\r' '\n' < inputfile > outputfile

The same tasks are sometimes performed with awk, sed, tr, or in Perl if the platform has a Perl interpreter:

awk '{sub("$","\r\n"); printf("%s",$0);}' inputfile > outputfile  # UNIX to DOS (adds CRs; for Linux and BSD systems without GNU extensions)
awk '{gsub("\r",""); print;}' inputfile > outputfile              # DOS to UNIX (removes CRs; for Linux and BSD systems without GNU extensions)
sed -e 's/$/\r/' inputfile > outputfile              # UNIX to DOS (adds CRs; for Linux systems with GNU extensions)
sed -e 's/\r$//' inputfile > outputfile              # DOS to UNIX (removes CRs; for Linux systems with GNU extensions)
cat inputfile | tr -d "\r" > outputfile              # DOS to UNIX (removes CRs using tr(1); not Unicode compliant)
perl -pe 's/\r?\n|\r/\r\n/g' inputfile > outputfile  # Convert to DOS
perl -pe 's/\r?\n|\r/\n/g'   inputfile > outputfile  # Convert to UNIX
perl -pe 's/\r?\n|\r/\r/g'   inputfile > outputfile  # Convert to old Mac
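On platforms without these tools, the same conversion can be sketched in Python. This is a rough counterpart to the Perl one-liners above, not a replacement for dos2unix; the function name is hypothetical, and the sketch works on bytes so that no text-mode translation interferes.

```python
import re

# CR+LF must come first so that it is replaced as a single newline.
_ANY_NEWLINE = re.compile(rb"\r\n|\r|\n")


def convert_newlines(data: bytes, target: bytes) -> bytes:
    # A function is used as the replacement so that the target bytes
    # are inserted literally, without any escape processing.
    return _ANY_NEWLINE.sub(lambda match: target, data)


mixed = b"a\r\nb\rc\nd"
print(convert_newlines(mixed, b"\r\n"))  # b'a\r\nb\r\nc\r\nd'  (to DOS)
print(convert_newlines(mixed, b"\n"))    # b'a\nb\nc\nd'        (to UNIX)
print(convert_newlines(mixed, b"\r"))    # b'a\rb\rc\rd'        (to old Mac)
```

To convert a file, read it with open(path, "rb"), pass the contents through the function, and write the result with open(path, "wb").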

To identify what type of line breaks a text file contains, the file command can be used. Moreover, the editor Vim can be used to make a file compatible with the Windows Notepad text editor. For example:

[prompt] > file myfile.txt
myfile.txt: ASCII English text
[prompt] > vim myfile.txt
  within vim :set fileformat=dos
             :wq
[prompt] > file myfile.txt
myfile.txt: ASCII English text, with CRLF line terminators

The following grep commands echo the filename (in this case myfile.txt) to the command line if the file is of the specified style:

grep -PL $'\r\n' myfile.txt # show UNIX style file (LF terminated)
grep -Pl $'\r\n' myfile.txt # show DOS style file (CRLF terminated)

For Debian-based systems, these commands are used:

egrep -L $'\r\n' myfile.txt # show UNIX style file (LF terminated)
egrep -l $'\r\n' myfile.txt # show DOS style file (CRLF terminated)

The above grep commands work under Unix systems or in Cygwin under Windows. Note that these commands make some assumptions about the kinds of files that exist on the system (specifically, they assume only UNIX and DOS-style files, with no Mac OS 9-style files).

This technique is often combined with find to list files recursively. For instance, the following command checks all "regular files" (i.e. it excludes directories, symbolic links, etc.) to find all UNIX-style files in a directory tree, starting from the current directory (.), and saves the results in the file unix_files.txt, overwriting it if the file already exists:

find . -type f -exec grep -PL '\r\n' {} \; > unix_files.txt
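A comparable recursive check can be sketched in Python with os.walk. The function name is hypothetical. Like the grep commands above, the sketch assumes that only UNIX and DOS-style files are present, and for simplicity it reads each file into memory in full.

```python
import os


def classify_tree(root="."):
    # Returns two sorted lists: files without any CR+LF (UNIX style)
    # and files containing at least one CR+LF (DOS style).
    unix_style, dos_style = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only regular files are checked; symbolic links are skipped.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                has_crlf = b"\r\n" in f.read()
            (dos_style if has_crlf else unix_style).append(path)
    return sorted(unix_style), sorted(dos_style)
```

Calling classify_tree(".") and writing the first list to a file gives a result similar to that of the find command above.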

This example will find C files and convert them to LF style line endings:

find . -name '*.[ch]' -exec fromdos {} \;

The file command also detects the type of EOL used:

file myfile.txt
> myfile.txt: ASCII text, with CRLF line terminators

Other tools permit the user to visualise the EOL characters:

od -a myfile.txt
cat -e myfile.txt
hexdump -c myfile.txt

The utilities dos2unix, unix2dos, mac2unix, unix2mac, mac2dos and dos2mac can perform these conversions. The flip[10] command is also often used.

References

  1. ^ The origin of the older computer term "CRLF" - which redirects to this Newline article - or "Carriage Return [and] Line Feed", derives from standard manual typewriter design, whereby at the end of a line of text the typist pushes a lever at the left end of the carriage to return it to position for beginning the next line. In so doing, a mechanism also rolls the typewriter's platen by one line, advancing ("feeding") the paper to the correct position.
  2. ^ cr.yp.to
  3. ^ UTR #13: Unicode Newline Guidelines
  4. ^ YAML Ain't Markup Language (YAML™) Version 1.2
  5. ^ "ECMAScript Language Specification 5th edition". ECMA International. December 2009. p. 15. http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf. Retrieved 4 April 2010. 
  6. ^ binmode - perldoc.perl.org
  7. ^ PHP: Strings - Manual
  8. ^ Lexical analysis – Python v3.0.1 documentation
  9. ^ What's new in Python 2.3
  10. ^ ASCII text converstion between UNIX, Macintosh, MS-DOS

This page contains text from Wikipedia, the Free Encyclopedia - http://en.wikipedia.org/wiki/Newline