A delimiter is a sequence of one or more characters used to specify the boundary between separate, independent regions in plain text or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values.
Delimiters represent one of various means to specify boundaries in a data stream. Declarative notation, for example, is an alternate method that uses a length field at the start of a data stream to specify the number of characters that the data stream contains.
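The contrast between the two approaches can be sketched in Python. The function names and the `<length>:<data>` layout below are assumptions chosen for illustration (loosely modelled on netstrings), not a format defined in this article.

```python
def encode_delimited(fields, delimiter=","):
    """Join fields with a delimiter; ambiguous if a field contains it."""
    return delimiter.join(fields)


def encode_length_prefixed(fields):
    """Prefix each field with its character count and a colon."""
    return "".join(f"{len(field)}:{field}" for field in fields)


def decode_length_prefixed(data):
    """Read '<length>:<field>' chunks until the input is exhausted."""
    fields = []
    pos = 0
    while pos < len(data):
        colon = data.index(":", pos)      # end of the length field
        length = int(data[pos:colon])
        start = colon + 1
        fields.append(data[start:start + length])
        pos = start + length
    return fields


fields = ["nancy", "$30,000"]
delimited = encode_delimited(fields)
prefixed = encode_length_prefixed(fields)

print(delimited.split(","))              # the comma inside "$30,000" splits the field
print(decode_length_prefixed(prefixed))  # both fields come back intact
```

The length-prefixed form never inspects the field contents for boundaries, which is why the embedded comma is harmless there.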
For example, the CSV file format uses a comma as the delimiter between fields, and an end-of-line indicator as the delimiter between records. A simple flat file database table can be specified this way, with one record per line and the fields of each record separated by commas.
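A minimal Python sketch of reading such a table follows. The sample rows and column names are invented for illustration; the standard-library `csv` module is used with its default comma field delimiter.

```python
import csv
import io

# Hypothetical sample data: the first line holds the column names.
text = "name,city,age\nalice,paris,33\nbob,oslo,28\n"

# csv.reader treats line endings as record delimiters and commas as field delimiters.
rows = list(csv.reader(io.StringIO(text)))
header, records = rows[0], rows[1:]

for record in records:
    print(dict(zip(header, record)))
```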
Common examples of bracket delimiters include:
Delimiters | Description |
( and ) | Parentheses. The Lisp programming language syntax is cited as recognizable primarily by its use of parentheses. |
{ and } | Curly brackets. |
< and > | Angle brackets. |
" and " | Double quotation marks, commonly used to denote string literals. |
' and ' | Single quotation marks, commonly used to denote string literals. |
/* and */ | Comment markers, used to denote comments in some programming languages. |
<% and %> | Template delimiters, used in some web templates to specify language boundaries. |
For delimiters used by particular programming languages, see Comparison of programming languages (syntax). For field and record delimiters, see ASCII and Control character.
==Delimiter collision==
Delimiter collision is a problem that occurs when an author or programmer introduces delimiters into text without actually intending them to be interpreted as boundaries between separate regions. In the case of XML, for example, this can occur whenever an author attempts to specify an angle bracket character. In most file types there is both a field delimiter and a record delimiter, both of which are subject to collision. In the case of comma-separated values files, for example, field delimiter collision can occur whenever an author attempts to include a comma as part of a field value (e.g., salary = "$30,000"), and record delimiter collision can occur whenever a field contains multiple lines. Both record and field delimiter collision occur frequently in text files.
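Both kinds of collision can be reproduced with a few lines of Python. This is a sketch with made-up records: naive splitting on the delimiter character is compared with the standard-library `csv` parser, which honours quotation marks around a field.

```python
import csv
import io

# Field delimiter collision: the salary value contains a comma.
line = 'nancy,"$30,000"'
naive = line.split(",")                          # splits inside the salary
parsed = next(csv.reader(io.StringIO(line)))     # keeps the salary whole

# Record delimiter collision: a quoted field spans two lines.
multi = 'a,"line one\nline two"\n'
naive_records = multi.splitlines()               # sees two records
multi_records = list(csv.reader(io.StringIO(multi)))  # sees one record

print(naive)
print(parsed)
print(len(naive_records), len(multi_records))
```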
In some contexts, a malicious user or attacker may seek to exploit this problem intentionally. Consequently, delimiter collision can be the source of security vulnerabilities and exploits. Malicious users can take advantage of delimiter collision in languages such as SQL and HTML to deploy such well-known attacks as SQL injection and cross-site scripting, respectively.
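The SQL case can be illustrated with an in-memory SQLite database. The table, rows, and attacker input below are hypothetical; the sketch only shows how a quotation mark inside user input collides with the string delimiter of the query, and how a parameterized query avoids the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Attacker-supplied input containing the string delimiter (a single quote).
user_input = "nobody' OR '1'='1"

# Unsafe: the quote in the input ends the string literal early, so the rest
# of the input is interpreted as SQL rather than as data.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: a parameterized query keeps the input out of the SQL text entirely.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(unsafe_rows)  # every row matches
print(safe_rows)    # no row matches
```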
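One method for avoiding delimiter collision is to use escape characters or escape sequences. A string written with an escape sequence, which names a character by a code, produces the same output as the same string written with an escape character placed before the delimiter.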
One drawback of escape sequences, when used by people, is the need to memorize the codes that represent individual characters (see also: character entity reference, numeric character reference).
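As a small illustration in Python (the sample sentence is an assumption made for the example), the same string literal can be written with an escape character before each quotation mark or with a hexadecimal escape sequence for it:

```python
# Escape character: a backslash placed before the delimiter.
with_escape_char = "Nancy said \"Hello World!\" to the crowd."

# Escape sequence: \x22 is the hexadecimal code of the double quotation mark.
with_escape_seq = "Nancy said \x22Hello World!\x22 to the crowd."

print(with_escape_char)
print(with_escape_char == with_escape_seq)
```

The second form is the one that requires remembering that 22 (hexadecimal) stands for the quotation mark.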
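Dual quoting delimiters offer another option: a string that contains one type of quotation mark is enclosed in the other type, which produces the desired output without requiring escapes. This approach, however, only works when the string does not contain both types of quotation marks.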
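Python accepts either single or double quotation marks as string delimiters, so the choice can be sketched as follows. The sample sentences are assumptions made for the example.

```python
# The text contains double quotation marks, so single quotes delimit it.
said = 'Nancy said "Hello World!" to the crowd.'

# The text contains an apostrophe, so double quotes delimit it.
refused = "Nancy doesn't want to talk."

# With both kinds present, one of them still has to be escaped.
both = 'Nancy doesn\'t say "Hello World!" anymore.'

print(said)
print(refused)
print(both)
```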
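Doubling the quotation mark inside the string, so that two consecutive delimiters stand for one literal occurrence, also produces the desired output without requiring escapes. Like regular escaping, however, it can become confusing when many quotes are used, and code that prints such source code, where every doubled quotation mark must be doubled again, looks more confusing still.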
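Python string literals do not support doubled quotation marks, but the CSV format does, so the standard-library `csv` module serves here as a stand-in to show the convention. The sample sentence is an assumption made for the example.

```python
import csv
import io

value = 'Nancy said "Hello World!" to the crowd.'

buffer = io.StringIO()
# doublequote=True is the default; it is spelled out here for clarity.
csv.writer(buffer, doublequote=True).writerow([value])
encoded = buffer.getvalue()

# Each embedded quotation mark is written twice inside the quoted field.
print(encoded)

# Reading the line back restores the single quotation marks.
decoded = next(csv.reader(io.StringIO(encoded)))
print(decoded[0])
```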
For example, in Perl, string literals written with a quote-like operator such as qq can each use a different delimiter character and still produce the desired output, because the operator allows any convenient character to act as a delimiter. Although this method is more flexible, few languages support it. Perl and Ruby are two that do.
A content boundary is a delimiter designed specifically to resist delimiter collision. This is usually done by specifying a random sequence of characters followed by an identifying mark such as a UUID, a timestamp, or some other distinguishing mark. (See, e.g., MIME, here documents.)
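A minimal sketch of the idea in Python follows. The helper name `make_boundary`, the `boundary-` prefix, and the sample parts are hypothetical; a real format such as MIME defines its own rules for how a boundary is written and terminated.

```python
import uuid


def make_boundary(parts):
    """Return a boundary string that occurs in none of the parts."""
    while True:
        # A random UUID serves as the distinguishing mark.
        boundary = f"boundary-{uuid.uuid4().hex}"
        if not any(boundary in part for part in parts):
            return boundary


parts = ["first, part", "second part\nwith a comma, and a newline"]
boundary = make_boundary(parts)
separator = f"\n{boundary}\n"

# Join the parts with the boundary, then split them apart again.
message = separator.join(parts)
recovered = message.split(separator)

print(recovered == parts)
```

Because the boundary is checked against the content before use, commas and newlines inside the parts cannot be mistaken for it.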
In specifying a regular expression, alternate delimiters may also be used to simplify the syntax for match and substitution operations in Perl.
For example, a simple match operation is conventionally written in Perl with the slash as the delimiter, in the form m/pattern/.
The syntax is flexible enough to specify match operations with alternate delimiters, making it easy to avoid delimiter collision:
my $string1 = 'Visit http://example.com/ today.'; # hypothetical target string, assumed for illustration
print $string1 =~ m@http://@; # match using alternate regular expression delimiter
print $string1 =~ m{http://}; # same as previous, but different delimiter
print $string1 =~ m!http://!; # same as previous, but different delimiter
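ASCII armoring, that is, encoding text (for example with base64) so that it cannot contain delimiter characters, is another way to avoid delimiter collision. This technique is used, for example, in Microsoft's ASP.NET web development technology, and is closely associated with the "VIEWSTATE" component of that system.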
Consider a simple HTML tag in which the VIEWSTATE value contains characters, such as double quotation marks, that are incompatible with the delimiters of the HTML tag itself. Such a tag is not well-formed, and would therefore not work properly in a "real world" deployed system. In contrast, when the same value is base64-encoded before it is placed in the tag, the incompatible characters are removed.
This prevents delimiter collision and ensures that incompatible characters will not appear inside the HTML code, regardless of what characters appear in the original (decoded) text.
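The effect can be sketched in Python with the standard-library `base64` module. The sample value and the exact attributes of the tag are assumptions for illustration; the real VIEWSTATE payload is produced by ASP.NET's own serialization, which is not reproduced here.

```python
import base64

# Hypothetical value containing both kinds of quotation mark.
value = 'BookTitle:Nancy doesn\'t say "Hello World!" anymore.'

# Placing the raw value in an attribute ends the attribute at the first
# embedded double quotation mark, so the tag is not well-formed.
broken_tag = f'<input type="hidden" name="__VIEWSTATE" value="{value}" />'

# Base64 output uses only letters, digits, '+', '/' and '=', so it cannot
# contain quotation marks or angle brackets.
encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
safe_tag = f'<input type="hidden" name="__VIEWSTATE" value="{encoded}" />'

print(safe_tag)
print(base64.b64decode(encoded).decode("utf-8") == value)
```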
This text is licensed under the Creative Commons CC-BY-SA License. This text was originally published on Wikipedia and was developed by the Wikipedia community.