1. Introduction
This section is informative.
Web applications should have the ability to manipulate as wide as possible a range of user input, including files that a user may wish to upload to a remote server or manipulate inside a rich web application. This specification defines the basic representations for files, lists of files, errors raised by access to files, and programmatic ways to read files. This specification also defines an interface that represents "raw data" which can be asynchronously processed on the main thread of conforming user agents. The interfaces and API defined in this specification can be used with other interfaces and APIs exposed to the web platform.
The File interface represents file data typically obtained from the underlying file system, and the Blob interface ("Binary Large Object", a name originally introduced to web APIs in Google Gears) represents immutable raw data. File or Blob reads should happen asynchronously on the main thread, with an optional synchronous API used within threaded web applications. An asynchronous API for reading files prevents blocking and UI "freezing" on a user agent’s main thread.
This specification defines an asynchronous API based on an event model to read and access a File or Blob’s data. A FileReader object provides asynchronous read methods to access that file’s data through event handler content attributes and the firing of events. The use of events and event handlers lets separate code blocks monitor the progress of the read (which is particularly useful for remote drives or mounted drives, where file access performance may vary from local drives) and handle error conditions that may arise during reading of a file.
The following example illustrates this.
function startRead() {
  // obtain input element through DOM
  var file = document.getElementById('file').files[0];
  if (file) {
    getAsText(file);
  }
}

function getAsText(readFile) {
  var reader = new FileReader();

  // Read file into memory as UTF-16
  reader.readAsText(readFile, "UTF-16");

  // Handle progress, success, and errors
  reader.onprogress = updateProgress;
  reader.onload = loaded;
  reader.onerror = errorHandler;
}

function updateProgress(evt) {
  if (evt.lengthComputable) {
    // evt.loaded and evt.total are ProgressEvent properties
    var loaded = (evt.loaded / evt.total);
    if (loaded < 1) {
      // Increase the prog bar length
      // style.width = (loaded * 200) + "px";
    }
  }
}

function loaded(evt) {
  // Obtain the read file data
  var fileString = evt.target.result;
  // Handle UTF-16 file dump
  if (utils.regexp.isChinese(fileString)) {
    // Chinese Characters + Name validation
  } else {
    // run other charset test
  }
  // xhr.send(fileString)
}

function errorHandler(evt) {
  if (evt.target.error.name == "NotReadableError") {
    // The file could not be read
  }
}
2. Terminology and Algorithms
When this specification says to terminate an algorithm the user agent must terminate the algorithm after finishing the step it is on.
Asynchronous read methods defined in this specification may return before the algorithm in question is terminated, and can be terminated by an abort() call.
The algorithms and steps in this specification use the following mathematical operations:
- max(a,b) returns the maximum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of max(6,4) the result is 6. This operation is also defined in ECMAScript [ECMA-262].
- min(a,b) returns the minimum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of min(6,4) the result is 4. This operation is also defined in ECMAScript [ECMA-262].
- Mathematical comparisons such as < (less than), ≤ (less than or equal to), and > (greater than) are as in ECMAScript [ECMA-262].
The term Unix Epoch is used in this specification to refer to the time 00:00:00 UTC on January 1 1970 (or 1970-01-01T00:00:00Z ISO 8601); this is the same time that is conceptually "0" in ECMA-262 [ECMA-262].
3. The Blob Interface and Binary Data
A Blob object refers to a byte sequence, and has a size attribute which is the total number of bytes in the byte sequence, and a type attribute, which is an ASCII-encoded string in lower case representing the media type of the byte sequence.
Each Blob must have an internal snapshot state, which must be initially set to the state of the underlying storage, if any such underlying storage exists. Further normative definition of snapshot state can be found for Files.
[Constructor(optional sequence<BlobPart> blobParts, optional BlobPropertyBag options),
 Exposed=(Window,Worker), Serializable]
interface Blob {
  readonly attribute unsigned long long size;
  readonly attribute DOMString type;

  // slice Blob into byte-ranged chunks
  Blob slice([Clamp] optional long long start,
             [Clamp] optional long long end,
             optional DOMString contentType);
};

enum EndingType { "transparent", "native" };

dictionary BlobPropertyBag {
  DOMString type = "";
  EndingType endings = "transparent";
};

typedef (BufferSource or Blob or USVString) BlobPart;
Blob objects are serializable objects. Their serialization steps, given value and serialized, are:
- Set serialized.[[SnapshotState]] to value’s snapshot state.
- Set serialized.[[ByteSequence]] to value’s underlying byte sequence.
Their deserialization steps, given serialized and value, are:
- Set value’s snapshot state to serialized.[[SnapshotState]].
- Set value’s underlying byte sequence to serialized.[[ByteSequence]].
3.1. Constructors
The Blob() constructor can be invoked with zero or more parameters. When the Blob() constructor is invoked, user agents must run the following steps:
- If invoked with zero parameters, return a new Blob object consisting of 0 bytes, with size set to 0, and with type set to the empty string.
- Let bytes be the result of processing blob parts given blobParts and options.
- If the type member of the optional options argument is provided and is not the empty string, run the following sub-steps:
  - Let t be the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
  - Convert every character in t to ASCII lowercase.
- Return a Blob object referring to bytes as its associated byte sequence, with its size set to the length of bytes, and its type set to the value of t from the substeps above (the empty string, the default value of the type member, if those substeps were not run).
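The type substeps above can be sketched as a small pure function. This is an illustrative sketch only; the name normalizeBlobType is hypothetical and does not appear in this specification.

```javascript
// Hypothetical helper mirroring the substeps for the `type` member.
function normalizeBlobType(type = "") {
  // Any character outside U+0020..U+007E turns the type into the empty string.
  if (/[^\u0020-\u007E]/.test(type)) {
    return "";
  }
  // ASCII lowercase only: map A-Z to a-z and leave everything else untouched.
  return type.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

const lowered = normalizeBlobType("Text/PLAIN;charset=UTF-8");
const rejected = normalizeBlobType("text/plain\u00e9");
```

Here lowered holds the lowercased media type string, while rejected is the empty string because the input contains a character outside the permitted range.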
3.1.1. Constructor Parameters
The Blob() constructor can be invoked with the parameters below:
- A blobParts sequence, which takes any number of the following types of elements, and in any order:
  - BufferSource elements.
  - Blob elements.
  - USVString elements.
- An optional BlobPropertyBag, which takes these optional members:
  - type, the ASCII-encoded string in lower case representing the media type of the Blob. Normative conditions for this member are provided in §3.1 Constructors.
  - endings, an enum which can take the values "transparent" or "native". By default this is set to "transparent". If set to "native", line endings will be converted to native in any USVString elements in blobParts.
To process blob parts given a sequence of BlobPart's parts and BlobPropertyBag options, run the following steps:
- Let bytes be an empty sequence of bytes.
- For each element in parts:
  - If element is a USVString, run the following substeps:
    - Let s be element.
    - If the endings member of options is "native", set s to the result of converting line endings to native of element.
    - Append the result of UTF-8 encoding s to bytes.
    Note: The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.
  - If element is a BufferSource, get a copy of the bytes held by the buffer source, and append those bytes to bytes.
  - If element is a Blob, append the bytes it represents to bytes.
    Note: The type of the Blob array element is ignored and will not affect type of returned Blob object.
- Return bytes.
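A minimal sketch of the string and BufferSource branches of these steps follows. The function name processBlobPartsSketch is hypothetical. Blob elements and the "native" endings option are deliberately left out, on the assumption that a script-level sketch cannot read a Blob's bytes synchronously.

```javascript
// Hypothetical sketch: concatenates strings (as UTF-8) and buffer sources.
function processBlobPartsSketch(parts) {
  const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  const chunks = [];
  for (const element of parts) {
    if (typeof element === "string") {
      // Unmatched surrogates are encoded as U+FFFD, as the Note describes.
      chunks.push(encoder.encode(element));
    } else if (ArrayBuffer.isView(element)) {
      // slice() yields a copy of the viewed bytes.
      chunks.push(
        new Uint8Array(element.buffer, element.byteOffset, element.byteLength).slice()
      );
    } else if (element instanceof ArrayBuffer) {
      chunks.push(new Uint8Array(element.slice(0)));
    } else {
      throw new TypeError("element type not covered by this sketch");
    }
  }
  // Join all chunks into one byte sequence.
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    bytes.set(c, offset);
    offset += c.length;
  }
  return bytes;
}

const mixed = processBlobPartsSketch(["\u00e9", new Uint8Array([1, 2])]);
const loneSurrogate = processBlobPartsSketch(["\uD800"]);
```

In this sketch mixed contains the two UTF-8 bytes of "é" followed by 1 and 2, and loneSurrogate contains the three UTF-8 bytes of U+FFFD.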
To convert line endings to native in a string s, run the following steps:
- Let native line ending be the code point U+000A LF.
- If the underlying platform’s conventions are to represent newlines as a carriage return and line feed sequence, set native line ending to the code point U+000D CR followed by the code point U+000A LF.
- Set result to the empty string.
- Let position be a position variable for s, initially pointing at the start of s.
- Let token be the result of collecting a sequence of code points that are not equal to U+000A LF or U+000D CR from s given position.
- Append token to result.
- While position is not past the end of s:
  - If the code point at position within s equals U+000D CR:
    - Append native line ending to result.
    - Advance position by 1.
    - If position is not past the end of s and the code point at position within s equals U+000A LF, advance position by 1.
  - Otherwise if the code point at position within s equals U+000A LF, advance position by 1 and append native line ending to result.
  - Let token be the result of collecting a sequence of code points that are not equal to U+000A LF or U+000D CR from s given position.
  - Append token to result.
- Return result.
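The line-ending conversion steps translate fairly directly into script. The sketch below is illustrative; the function name is hypothetical, and the native line ending is passed in as a parameter (an assumption made here, since script cannot portably detect the platform convention).

```javascript
// Hypothetical sketch of the conversion; nativeLineEnding is "\n" or "\r\n".
function convertLineEndingsToNative(s, nativeLineEnding = "\n") {
  let result = "";
  let position = 0;

  // Collect a run of characters that are neither LF nor CR.
  const collectToken = () => {
    const start = position;
    while (position < s.length && s[position] !== "\n" && s[position] !== "\r") {
      position++;
    }
    return s.slice(start, position);
  };

  result += collectToken();
  while (position < s.length) {
    if (s[position] === "\r") {
      result += nativeLineEnding;
      position++;
      // A CR LF pair counts as a single line ending.
      if (position < s.length && s[position] === "\n") {
        position++;
      }
    } else if (s[position] === "\n") {
      position++;
      result += nativeLineEnding;
    }
    result += collectToken();
  }
  return result;
}

const toCrlf = convertLineEndingsToNative("a\r\nb\rc\nd", "\r\n");
const toLf = convertLineEndingsToNative("a\r\nb\rc\nd", "\n");
```

Each of the three line-ending forms in the input (CR LF, lone CR, lone LF) is replaced by exactly one native line ending.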
// Create a new Blob object
var a = new Blob();

// Create a 1024-byte ArrayBuffer
// buffer could also come from reading a File
var buffer = new ArrayBuffer(1024);

// Create ArrayBufferView objects based on buffer
var shorts = new Uint16Array(buffer, 512, 128);
var bytes = new Uint8Array(buffer, shorts.byteOffset + shorts.byteLength);

var b = new Blob(["foobarbazetcetc" + "birdiebirdieboo"], {type: "text/plain;charset=utf-8"});

var c = new Blob([b, shorts]);

// Reuse the variable declared above rather than redeclaring it
a = new Blob([b, c, bytes]);

var d = new Blob([buffer, b, c, bytes]);
3.2. Attributes
size, of type unsigned long long, readonly
- Returns the size of the byte sequence in number of bytes. On getting, conforming user agents must return the total number of bytes that can be read by a FileReader or FileReaderSync object, or 0 if the Blob has no bytes to be read.
type, of type DOMString, readonly
- The ASCII-encoded string in lower case representing the media type of the Blob. On getting, user agents must return the type of a Blob as an ASCII-encoded string in lower case, such that when it is converted to a byte sequence, it is a parsable MIME type, or the empty string – 0 bytes – if the type cannot be determined.
  The type attribute can be set by the web application itself through constructor invocation and through the slice() call; in these cases, further normative conditions for this attribute are in §3.1 Constructors, §4.1 Constructor, and §3.3.1 The slice method respectively. User agents can also determine the type of a Blob, especially if the byte sequence is from an on-disk file; in this case, further normative conditions are in the file type guidelines.
  Note: The type t of a Blob is considered a parsable MIME type, if performing the parse a MIME type algorithm to a byte sequence converted from the ASCII-encoded string representing the Blob object’s type does not return undefined.
  Note: Use of the type attribute informs the encoding determination and determines the Content-Type header when fetching blob URLs.
3.3. Methods and Parameters
3.3.1. The slice method
The slice() method returns a new Blob object with bytes ranging from the optional start parameter up to but not including the optional end parameter, and with a type attribute that is the value of the optional contentType parameter. It must act as follows:
- Let O be the Blob context object on which the slice() method is being called.
- The optional start parameter is a value for the start point of a slice() call, and must be treated as a byte-order position, with the zeroth position representing the first byte. User agents must process slice() with start normalized according to the following:
- The optional end parameter is a value for the end point of a slice() call. User agents must process slice() with end normalized according to the following:
- The optional contentType parameter is used to set the ASCII-encoded string in lower case representing the media type of the Blob. User agents must process the slice() with contentType normalized according to the following:
  - If the contentType parameter is not provided, let relativeContentType be set to the empty string.
  - Else let relativeContentType be set to contentType and run the substeps below:
    - If relativeContentType contains any characters outside the range of U+0020 to U+007E, then set relativeContentType to the empty string and return from these substeps.
    - Convert every character in relativeContentType to ASCII lowercase.
- Let span be max((relativeEnd - relativeStart), 0).
- Return a new Blob object S with the following characteristics:
The examples below illustrate the different types of slice() calls possible. Since the File interface inherits from the Blob interface, examples are based on the use of the File interface.
// obtain input element through DOM
var file = document.getElementById('file').files[0];
if (file) {
  // create an identical copy of file
  // the two calls below are equivalent
  var fileClone = file.slice();
  var fileClone2 = file.slice(0, file.size);

  // slice file into 1/2 chunk starting at middle of file
  // Note the use of negative number
  var fileChunkFromEnd = file.slice(-(Math.round(file.size/2)));

  // slice file into 1/2 chunk starting at beginning of file
  var fileChunkFromStart = file.slice(0, Math.round(file.size/2));

  // slice file from beginning till 150 bytes before end
  var fileNoMetadata = file.slice(0, -150, "application/experimental");
}
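The detailed normalization rules for start and end are not reproduced in the steps of this section. The sketch below therefore assumes the clamping behaviour that the examples above suggest: an omitted start is 0, an omitted end is the size, negative values count back from the end, and everything is clamped to the range 0 to size. The function name normalizeSliceRange is hypothetical; only the span formula is taken directly from the text.

```javascript
// Hypothetical sketch; the clamping rules are an assumption (see lead-in).
function normalizeSliceRange(size, start, end) {
  const normalize = (value, fallback) => {
    if (value === undefined) {
      return fallback;
    }
    // Negative offsets are measured from the end of the byte sequence.
    return value < 0 ? Math.max(size + value, 0) : Math.min(value, size);
  };
  const relativeStart = normalize(start, 0);
  const relativeEnd = normalize(end, size);
  // From the steps above: span is max((relativeEnd - relativeStart), 0).
  const span = Math.max(relativeEnd - relativeStart, 0);
  return { relativeStart, relativeEnd, span };
}

const whole = normalizeSliceRange(1000);
const lastHalf = normalizeSliceRange(1000, -500);
const withoutTail = normalizeSliceRange(1000, 0, -150);
const inverted = normalizeSliceRange(1000, 900, 100);
```

Under these assumptions whole spans all 1000 bytes, lastHalf starts at byte 500, withoutTail ends at byte 850, and inverted has a span of 0 because its end precedes its start.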
4. The File Interface
A File object is a Blob object with a name attribute, which is a string; it can be created within the web application via a constructor, or is a reference to a byte sequence from a file from the underlying (OS) file system.
If a File object is a reference to a byte sequence originating from a file on disk, then its snapshot state should be set to the state of the file on disk at the time the File object is created.
Note: This is a non-trivial requirement to implement for user agents, and is thus not a must but a should [RFC2119]. User agents should endeavor to have a File object’s snapshot state set to the state of the underlying storage on disk at the time the reference is taken. If the file is modified on disk following the time a reference has been taken, the File's snapshot state will differ from the state of the underlying storage. User agents may use modification time stamps and other mechanisms to maintain snapshot state, but this is left as an implementation detail.
When a File object refers to a file on disk, user agents must return the type of that file, and must follow the file type guidelines below:
- User agents must return the type as an ASCII-encoded string in lower case, such that when it is converted to a corresponding byte sequence, it is a parsable MIME type, or the empty string – 0 bytes – if the type cannot be determined.
- When the file is of type text/plain user agents must NOT append a charset parameter to the dictionary of parameters portion of the media type [MIMESNIFF].
- User agents must not attempt heuristic determination of encoding, including statistical methods.
[Constructor(sequence<BlobPart> fileBits, USVString fileName, optional FilePropertyBag options),
 Exposed=(Window,Worker), Serializable]
interface File : Blob {
  readonly attribute DOMString name;
  readonly attribute long long lastModified;
};

dictionary FilePropertyBag : BlobPropertyBag {
  long long lastModified;
};
File objects are serializable objects. Their serialization steps, given value and serialized, are:
- Set serialized.[[SnapshotState]] to value’s snapshot state.
- Set serialized.[[ByteSequence]] to value’s underlying byte sequence.
- Set serialized.[[Name]] to the value of value’s name attribute.
- Set serialized.[[LastModified]] to the value of value’s lastModified attribute.
Their deserialization steps, given value and serialized, are:
- Set value’s snapshot state to serialized.[[SnapshotState]].
- Set value’s underlying byte sequence to serialized.[[ByteSequence]].
- Initialize the value of value’s name attribute to serialized.[[Name]].
- Initialize the value of value’s lastModified attribute to serialized.[[LastModified]].
4.1. Constructor
The File constructor is invoked with two or three parameters, depending on whether the optional dictionary parameter is used. When the File() constructor is invoked, user agents must run the following steps:
- Let bytes be the result of processing blob parts given fileBits and options.
- Let n be a new string of the same size as the fileName argument to the constructor. Copy every character from fileName to n, replacing any "/" character (U+002F SOLIDUS) with a ":" (U+003A COLON).
  Note: Underlying OS filesystems use differing conventions for file name; with constructed files, mandating UTF-16 lessens ambiguity when file names are converted to byte sequences.
- If the optional FilePropertyBag dictionary argument is used, then run the following substeps:
  - If the type member is provided and is not the empty string, let t be set to the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
  - Convert every character in t to ASCII lowercase.
  - If the lastModified member is provided, let d be set to the lastModified dictionary member. If it is not provided, set d to the current date and time represented as the number of milliseconds since the Unix Epoch (which is the equivalent of Date.now() [ECMA-262]).
    Note: Since ECMA-262 Date objects convert to long long values representing the number of milliseconds since the Unix Epoch, the lastModified member could be a Date object [ECMA-262].
- Return a new File object F such that:
  - F refers to the bytes byte sequence.
  - F.size is set to the number of total bytes in bytes.
  - F.name is set to n.
  - F.type is set to t.
  - F.lastModified is set to d.
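The name, type and lastModified handling in these steps can be sketched as a pure function. This is an illustrative sketch of the steps as written in this section, not a description of any particular implementation; the function name and the now parameter are hypothetical, with now standing in for the current time so that the result is deterministic.

```javascript
// Hypothetical sketch of how n, t and d are derived in the steps above.
function fileConstructorFieldsSketch(fileName, options = {}, now = Date.now()) {
  // n: copy of fileName with every "/" replaced by ":".
  const n = fileName.replace(/\//g, ":");

  // t: empty if any character is outside U+0020..U+007E, else ASCII lowercase.
  let t = options.type === undefined ? "" : options.type;
  if (/[^\u0020-\u007E]/.test(t)) {
    t = "";
  } else {
    t = t.replace(/[A-Z]/g, (c) => c.toLowerCase());
  }

  // d: the provided value (a Date converts to milliseconds), else the current time.
  const d = options.lastModified === undefined ? now : Number(options.lastModified);

  return { name: n, type: t, lastModified: d };
}

const fields = fileConstructorFieldsSketch("drafts/Draft1.txt", {
  type: "Text/Plain",
  lastModified: new Date(5),
});
const defaults = fileConstructorFieldsSketch("Draft2.txt", {}, 42);
```

In this sketch fields.name is "drafts:Draft1.txt" and fields.lastModified is 5, while defaults falls back to the supplied now value of 42 and an empty type.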
4.1.1. Constructor Parameters
The File() constructor can be invoked with the parameters below:
- A fileBits sequence, which takes any number of the following elements, and in any order:
  - BufferSource elements.
  - Blob elements.
  - USVString elements.
- A fileName parameter: a USVString parameter representing the name of the file; normative conditions for this constructor parameter can be found in §4.1 Constructor.
- An optional FilePropertyBag dictionary, which in addition to the members of BlobPropertyBag takes one member:
  - An optional lastModified member, which must be a long long; normative conditions for this member are provided in §4.1 Constructor.
4.2. Attributes
name, of type DOMString, readonly
- The name of the file. On getting, this must return the name of the file as a string. There are numerous file name variations and conventions used by different underlying OS file systems; this is merely the name of the file, without path information. On getting, if user agents cannot make this information available, they must return the empty string. If a File object is created using a constructor, further normative conditions for this attribute are found in §4.1 Constructor.
lastModified, of type long long, readonly
- The last modified date of the file. On getting, if user agents can make this information available, this must return a long long set to the time the file was last modified as the number of milliseconds since the Unix Epoch. If the last modification date and time are not known, the attribute must return the current date and time as a long long representing the number of milliseconds since the Unix Epoch; this is equivalent to Date.now() [ECMA-262]. If a File object is created using a constructor, further normative conditions for this attribute are found in §4.1 Constructor.
The File interface is available on objects that expose an attribute of type FileList; these objects are defined in HTML [HTML]. The File interface, which inherits from Blob, is immutable, and thus represents file data that can be read into memory at the time a read operation is initiated. User agents must process reads on files that no longer exist at the time of read as errors, throwing a NotFoundError exception if using a FileReaderSync on a Web Worker [Workers] or firing an error event with the error attribute returning a NotFoundError.
var file = document.getElementById("filePicker").files[0];
var date = new Date(file.lastModified);
console.log("You selected the file " + file.name + " which was modified on " + date.toDateString() + ".");

...

// Generate a file with a specific last modified date
// Note: Date months are zero-based, so month 12 rolls over to January 2014
var d = new Date(2013, 12, 5, 16, 23, 45, 600);
var generatedFile = new File(["Rough Draft ...."], "Draft1.txt", {type: "text/plain", lastModified: d})

...
5. The FileList Interface
Note: The FileList interface should be considered "at risk" since the general trend on the Web Platform is to replace such interfaces with the Array platform object in ECMAScript [ECMA-262]. In particular, this means syntax of the sort filelist.item(0) is at risk; most other programmatic use of FileList is unlikely to be affected by the eventual migration to an Array type.
This interface is a list of File objects.
[Exposed=(Window,Worker), Serializable]
interface FileList
{
getter File? item(unsigned long index);
readonly attribute unsigned long length;
};
FileList objects are serializable objects. Their serialization steps, given value and serialized, are:
- Set serialized.[[Files]] to an empty list.
- For each file in value, append the sub-serialization of file to serialized.[[Files]].
Their deserialization steps, given serialized and value, are:
- For each file of serialized.[[Files]], add the sub-deserialization of file to value.
Sample usage typically involves DOM access to the <input type="file"> element within a form, and then accessing selected files.
// uploadData is a form element
// fileChooser is input element of type 'file'
var file = document.forms['uploadData']['fileChooser'].files[0];

// alternative syntax can be
// var file = document.forms['uploadData']['fileChooser'].files.item(0);

if (file) {
  // Perform file ops
}
5.1. Attributes
length, of type unsigned long, readonly
- must return the number of files in the FileList object. If there are no files, this attribute must return 0.
5.2. Methods and Parameters
item(index)
- must return the indexth File object in the FileList. If there is no indexth File object in the FileList, then this method must return null.
  index must be treated by user agents as a value for the position of a File object in the FileList, with 0 representing the first file. Supported property indices are the numbers in the range zero to one less than the number of File objects represented by the FileList object. If there are no such File objects, then there are no supported property indices.
Note: The HTMLInputElement interface has a readonly attribute of type FileList, which is what is being accessed in the above example. Other interfaces with a readonly attribute of type FileList include the DataTransfer interface.
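The behaviour of item(index) described above can be sketched over a plain array. The function name itemSketch is hypothetical, and a plain array of strings stands in for a FileList here, which is an assumption made purely for illustration.

```javascript
// Hypothetical sketch: returns the entry at index, or null when out of range.
function itemSketch(files, index) {
  return index >= 0 && index < files.length ? files[index] : null;
}

const selected = ["report.pdf", "photo.png"]; // stand-ins for File objects
const first = itemSketch(selected, 0);
const missing = itemSketch(selected, 2);
```

Here first is the entry at position 0 and missing is null, since the supported indices for a two-entry list are 0 and 1.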
6. Reading Data
6.1. The Read Operation
The algorithm below defines a read operation, which takes a Blob and a synchronous flag as input, and reads bytes into a byte stream which is returned as the result of the read operation, or else fails along with a failure reason. Methods in this specification invoke the read operation with the synchronous flag either set or unset.
The synchronous flag determines if a read operation is synchronous or asynchronous, and is unset by default. Methods may set it. If it is set, the read operation takes place synchronously. Otherwise, it takes place asynchronously.
To perform a read operation on a Blob and the synchronous flag, run the following steps:
- Let s be a new body, b be the Blob to be read from, and bytes initially set to an empty byte sequence. Set the length on s to the size of b. While there are still bytes to be read in b, perform the following substeps:
  - If the synchronous flag is set, follow the steps below:
    - Let bytes be the byte sequence that results from reading a chunk from b. If a file read error occurs reading a chunk from b, return s with the error flag set, along with a failure reason, and terminate this algorithm.
      Note: Along with returning failure, the synchronous part of this algorithm must return the failure reason that occurred for throwing an exception by synchronous methods that invoke this algorithm with the synchronous flag set.
    - If there are no errors, push bytes to s, and increment s’s transmitted [Fetch] by the number of bytes in bytes. Reset bytes to the empty byte sequence and continue reading chunks as above.
    - When all the bytes of b have been read into s, return s and terminate this algorithm.
  - Otherwise, the synchronous flag is unset. Return s and process the rest of this algorithm asynchronously.
  - Let bytes be the byte sequence that results from reading a chunk from b. If a file read error occurs reading a chunk from b, set the error flag on s, and terminate this algorithm with a failure reason.
    Note: The asynchronous part of this algorithm must signal the failure reason that occurred for asynchronous error reporting by methods expecting s and which invoke this algorithm with the synchronous flag unset.
  - If no file read error occurs, push bytes to s, and increment s’s transmitted [Fetch] by the number of bytes in bytes. Reset bytes to the empty byte sequence and continue reading chunks as above.
To perform an annotated task read operation on a Blob b, perform the steps below:
- Perform a read operation on b with the synchronous flag unset, along with the additional steps below.
- If the read operation terminates with a failure reason, queue a task to process read error with the failure reason and terminate this algorithm.
- When the first chunk is being pushed to the body s during the read operation, queue a task to process read.
- Once the body s from the read operation has at least one chunk read into it, or there are no chunks left to read from b, queue a task to process read data. Keep queuing tasks to process read data for every chunk read or every 50ms, whichever is least frequent.
- When all of the chunks from b are read into the body s from the read operation, queue a task to process read EOF.
Use the file reading task source for all these tasks.
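The order in which an annotated task read operation queues its tasks can be sketched with a small simulation. This is a simplified illustration: the function name and the failAtChunk parameter are hypothetical, one "process read data" task is assumed per chunk (the 50ms throttling is ignored), and no real I/O or task queue is involved.

```javascript
// Hypothetical simulation: returns the names of the tasks that would be queued.
function annotatedReadTasksSketch(chunks, failAtChunk = -1) {
  const tasks = [];
  for (let i = 0; i < chunks.length; i++) {
    if (i === failAtChunk) {
      // A failure reason ends the operation; no EOF task is queued.
      tasks.push("process read error");
      return tasks;
    }
    if (i === 0) {
      // Queued when the first chunk is pushed to the body.
      tasks.push("process read");
    }
    tasks.push("process read data");
  }
  if (chunks.length === 0) {
    // "There are no chunks left to read" also triggers process read data.
    tasks.push("process read data");
  }
  tasks.push("process read EOF");
  return tasks;
}

const successful = annotatedReadTasksSketch(["chunk-1", "chunk-2"]);
const failing = annotatedReadTasksSketch(["chunk-1", "chunk-2"], 1);
```

For the successful case the sequence is process read, two process read data tasks, then process read EOF; in the failing case the sequence ends with process read error after the first chunk.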
6.2. The File Reading Task Source
This specification defines a new generic task source called the file reading task source, which is used for all tasks that are queued in this specification to read byte sequences associated with Blob and File objects. It is to be used for features that trigger in response to asynchronously reading binary data.
6.3. The FileReader API
[Constructor, Exposed=(Window,Worker)]
interface FileReader : EventTarget {
  // async read methods
  void readAsArrayBuffer(Blob blob);
  void readAsBinaryString(Blob blob);
  void readAsText(Blob blob, optional DOMString label);
  void readAsDataURL(Blob blob);

  void abort();

  // states
  const unsigned short EMPTY = 0;
  const unsigned short LOADING = 1;
  const unsigned short DONE = 2;

  readonly attribute unsigned short readyState;

  // File or Blob data
  readonly attribute (DOMString or ArrayBuffer)? result;

  readonly attribute DOMException? error;

  // event handler content attributes
  attribute EventHandler onloadstart;
  attribute EventHandler onprogress;
  attribute EventHandler onload;
  attribute EventHandler onabort;
  attribute EventHandler onerror;
  attribute EventHandler onloadend;
};
6.3.1. Constructor
When the FileReader() constructor is invoked, the user agent must return a new FileReader object.
In environments where the global object is represented by a Window or a WorkerGlobalScope object, the FileReader constructor must be available.
6.3.2. Event Handler Content Attributes
The following are the event handler content attributes (and their corresponding event handler event types) that user agents must support on FileReader as DOM attributes:
event handler content attribute | event handler event type |
---|---|
onloadstart | loadstart |
onprogress | progress |
onabort | abort |
onerror | error |
onload | load |
onloadend | loadend |
6.3.3. FileReader States
The FileReader object can be in one of 3 states. The readyState attribute, on getting, must return the current state, which must be one of the following values:
EMPTY (numeric value 0)
- The FileReader object has been constructed, and there are no pending reads. None of the read methods have been called. This is the default state of a newly minted FileReader object, until one of the read methods has been called on it.
LOADING (numeric value 1)
- A File or Blob is being read. One of the read methods is being processed, and no error has occurred during the read.
DONE (numeric value 2)
- The entire File or Blob has been read into memory, OR a file read error occurred, OR the read was aborted using abort(). The FileReader is no longer reading a File or Blob. If readyState is set to DONE it means at least one of the read methods has been called on this FileReader.
6.3.4. Reading a File or Blob
The FileReader interface makes available several asynchronous read methods (readAsArrayBuffer(), readAsBinaryString(), readAsText() and readAsDataURL()), which read files into memory. If multiple concurrent read methods are called on the same FileReader object, user agents must throw an InvalidStateError on any of the read methods that occur when readyState = LOADING.
(FileReaderSync makes available several synchronous read methods. Collectively, the sync and async read methods of FileReader and FileReaderSync are referred to as just read methods.)
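The interaction between readyState and concurrent read calls can be sketched as a small state machine. This is an illustrative model only: the class ReaderStateSketch and its startRead and finishRead methods are hypothetical, and a plain Error with its name set stands in for a DOMException.

```javascript
// Hypothetical model of the EMPTY / LOADING / DONE states and the
// InvalidStateError rule for read calls made while a read is in progress.
class ReaderStateSketch {
  static EMPTY = 0;
  static LOADING = 1;
  static DONE = 2;

  readyState = ReaderStateSketch.EMPTY;

  // Stands in for any of the asynchronous read methods.
  startRead() {
    if (this.readyState === ReaderStateSketch.LOADING) {
      const error = new Error("A read is already in progress");
      error.name = "InvalidStateError";
      throw error;
    }
    this.readyState = ReaderStateSketch.LOADING;
  }

  // Stands in for completion, a read error, or abort().
  finishRead() {
    this.readyState = ReaderStateSketch.DONE;
  }
}

const reader = new ReaderStateSketch();
reader.startRead();
let concurrentErrorName = null;
try {
  reader.startRead(); // second read while LOADING
} catch (e) {
  concurrentErrorName = e.name;
}
reader.finishRead();
```

A new read may be started again once the state is DONE; only a call made while the state is LOADING is rejected.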
6.3.4.1. The result attribute
On getting, the result attribute returns a Blob's data as a DOMString, or as an ArrayBuffer, or null, depending on the read method that has been called on the FileReader, and any errors that may have occurred.
The list below is normative for the result attribute and is the conformance criteria for this attribute:
- On getting, if the readyState is EMPTY (no read method has been called) then the result attribute must return null.
- On getting, if an error in reading the File or Blob has occurred (using any read method) then the result attribute must return null.
- On getting, if the readAsDataURL() read method is used, the result attribute must return a DOMString that is a Data URL [RFC2397] encoding of the File or Blob's data.
- On getting, if the readAsBinaryString() read method is called and no error in reading the File or Blob has occurred, then the result attribute must return a DOMString representing the File or Blob's data as a binary string, in which every byte is represented by a code unit of equal value [0...255].
- On getting, if the readAsText() read method is called and no error in reading the File or Blob has occurred, then the result attribute must return a string representing the File or Blob's data as a text string, and should decode the string into memory in the format specified by the encoding determination as a DOMString.
- On getting, if the readAsArrayBuffer() read method is called and no error in reading the File or Blob has occurred, then the result attribute must return an ArrayBuffer object.
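The "binary string" form mentioned for readAsBinaryString() maps each byte to one code unit of the same value. A minimal sketch, with the hypothetical function name toBinaryStringSketch:

```javascript
// Hypothetical sketch: every byte becomes one code unit in the range 0..255.
function toBinaryStringSketch(bytes) {
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return binary;
}

const binaryString = toBinaryStringSketch(new Uint8Array([0x68, 0x69, 0xff]));
```

The resulting string has the same length as the byte sequence, and charCodeAt() on any position recovers the original byte value.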
6.3.4.2. The readAsDataURL() method
When the readAsDataURL(blob) method is called, the user agent must run the steps below.
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, set readyState to LOADING.
- Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
- To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.
- To process read, fire a progress event called loadstart at the context object.
- To process read data, fire a progress event called progress at the context object.
- To process read EOF, run these substeps:
  - Set readyState to DONE.
  - Set the result attribute to the body returned by the read operation as a Data URL [RFC2397]; on getting, the result attribute returns the blob as a Data URL [RFC2397].
    - Use the blob’s type attribute as part of the Data URL if it is available, in keeping with the Data URL specification [RFC2397].
    - If the type attribute is not available on the blob, return a Data URL without a media-type [RFC2397]. Data URLs that do not have media-types [RFC2046] must be treated as plain text by conforming user agents [RFC2397].
  - Fire a progress event called load at the context object.
  - Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING, do NOT fire loadend at the context object.
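The two type-handling substeps above can be sketched as a small helper. This is an illustration only, not the user agent's algorithm: the function name toDataURL is hypothetical, base64 is assumed as the encoding (RFC 2397 also permits percent-encoding), and the sketch assumes the btoa global is available.

```javascript
// Hypothetical helper (not part of this specification): build a Data URL
// [RFC2397] from raw bytes and an optional media type.
function toDataURL(bytes, type) {
  // Turn each byte into a code unit of equal value (a "binary string").
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  // Use the type if it is available; otherwise emit no media-type.
  const mediaType = type ? type : "";
  return "data:" + mediaType + ";base64," + btoa(binary);
}

const hiBytes = new Uint8Array([104, 105]); // the bytes of "hi"
console.log(toDataURL(hiBytes, "text/plain")); // data:text/plain;base64,aGk=
console.log(toDataURL(hiBytes, ""));           // data:;base64,aGk=
```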
6.3.4.3. The readAsText() method
The readAsText() method can be called with an optional parameter, label, which is a DOMString argument that represents the label of an encoding [Encoding]; if provided, it must be used as part of the encoding determination used when processing this method call.
When the readAsText(blob, label) method is called, the user agent must run the steps below.
- If readyState = LOADING, throw an InvalidStateError and terminate this algorithm.
- Otherwise, set readyState to LOADING.
- Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
- To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.
- To process read, fire a progress event called loadstart at the context object.
- To process read data, fire a progress event called progress at the context object.
- To process read EOF, run these substeps:
  - Set readyState to DONE.
  - Set the result attribute to the body returned by the read operation, represented as a string in a format determined by the encoding determination.
  - Fire a progress event called load at the context object.
  - Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING, do NOT fire loadend at the context object.
6.3.4.4. The readAsArrayBuffer() method
When the readAsArrayBuffer(blob) method is called, the user agent must run the steps below.
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, set readyState to LOADING.
- Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
- To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.
- To process read, fire a progress event called loadstart at the context object.
- To process read data, fire a progress event called progress at the context object.
- To process read EOF, run these substeps:
  - Set readyState to DONE.
  - Set the result attribute to the body returned by the read operation as an ArrayBuffer object.
  - Fire a progress event called load at the context object.
  - Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING, do NOT fire loadend at the context object.
6.3.4.5. The readAsBinaryString() method
When the readAsBinaryString(blob) method is called, the user agent must run the steps below.
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, set readyState to LOADING.
- Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.
- To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.
- To process read, fire a progress event called loadstart at the context object.
- To process read data, fire a progress event called progress at the context object.
- To process read EOF, run these substeps:
  - Set readyState to DONE.
  - Set the result attribute to the body returned by the read operation as a binary string.
  - Fire a progress event called load at the context object.
  - Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING, do NOT fire loadend at the context object.
readAsArrayBuffer() is preferred over readAsBinaryString(), which is provided for backwards compatibility.
6.3.4.6. Error Steps
These error steps are to process read error with a failure reason.
- Set the context object’s readyState to DONE and result to null if it is not already set to null.
- Set the error attribute on the context object; on getting, the error attribute must be a DOMException object that corresponds to the failure reason. Fire a progress event called error at the context object.
- Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING, do NOT fire loadend at the context object.
- Terminate the algorithm for any read method.
6.3.4.7. The abort() method
When the abort() method is called, the user agent must run the steps below:
- If readyState = EMPTY or if readyState = DONE, set result to null and terminate this algorithm.
- If readyState = LOADING, set readyState to DONE and result to null.
- If there are any tasks from the context object on the file reading task source in an affiliated task queue, then remove those tasks from that task queue.
- Terminate the algorithm for the read method being processed.
- Fire a progress event called abort.
- Fire a progress event called loadend.
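The abort() steps can be read as a small state transition. The sketch below models them on a plain object rather than a real FileReader; the function name abortSteps and the pendingTasks array (standing in for tasks queued on the file reading task source) are assumptions made for illustration.

```javascript
// readyState values as defined on FileReader.
const EMPTY = 0, LOADING = 1, DONE = 2;

// Hypothetical model of the abort() steps. Returns the names of the events
// that would be fired, in order.
function abortSteps(reader) {
  if (reader.readyState === EMPTY || reader.readyState === DONE) {
    reader.result = null;
    return []; // the algorithm terminates; no events fire
  }
  // readyState is LOADING here.
  reader.readyState = DONE;
  reader.result = null;
  reader.pendingTasks.length = 0; // remove queued read tasks
  return ["abort", "loadend"];
}

const idleReader = { readyState: EMPTY, result: null, pendingTasks: [] };
const busyReader = { readyState: LOADING, result: null, pendingTasks: ["progress"] };
console.log(abortSteps(idleReader)); // no events
console.log(abortSteps(busyReader)); // abort, then loadend
```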
6.3.4.8. Blob Parameters
The asynchronous read methods, the synchronous read methods, and URL.createObjectURL() take a Blob parameter. This section defines this parameter.
blob
- This is a Blob argument and must be a reference to a single File in a FileList or a Blob argument not obtained from the underlying OS file system.
6.4. Determining Encoding
When reading Blob objects using the readAsText() read method, the following encoding determination steps must be followed:
- Let encoding be null.
- If the label argument is present when calling the method, set encoding to the result of getting an encoding from label.
- If the getting an encoding steps above return failure, then set encoding to null.
- If encoding is null, and the blob argument’s type attribute is present, and it uses a Charset Parameter [RFC2046], set encoding to the result of getting an encoding for the portion of the Charset Parameter that is a label of an encoding.
  If blob has a type attribute of text/plain;charset=utf-8 then getting an encoding is run using "utf-8" as the label. Note that user agents must parse and extract the portion of the Charset Parameter that constitutes a label of an encoding.
- If the getting an encoding steps above return failure, then set encoding to null.
- If encoding is null, then set encoding to utf-8.
- Decode this blob using fallback encoding encoding, and return the result. On getting, the result attribute of the FileReader object returns a string in encoding format. The synchronous readAsText() method of the FileReaderSync object returns a string in encoding format.
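The selection part of these steps (everything before the final decode) can be sketched as follows. This is a rough approximation under stated assumptions: determineEncoding is a hypothetical name, "getting an encoding" is approximated by constructing a TextDecoder (which rejects labels it does not recognize), and the charset extraction uses a simplified regular expression rather than a full media type parser.

```javascript
// Hypothetical helper: returns the name of the encoding that the steps above
// would select for a given label and type.
function determineEncoding(label, type) {
  // Approximation of "getting an encoding": TextDecoder throws for labels it
  // does not recognize, which is treated as failure (null) here.
  const getEncoding = (candidate) => {
    try {
      return new TextDecoder(candidate).encoding;
    } catch (e) {
      return null;
    }
  };

  let encoding = null;
  if (label !== undefined) {
    encoding = getEncoding(label);
  }
  if (encoding === null && type) {
    // Simplified extraction of the charset parameter from the type string.
    const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(type);
    if (match) {
      encoding = getEncoding(match[1]);
    }
  }
  // Fall back to utf-8 when nothing usable was found.
  return encoding === null ? "utf-8" : encoding;
}

console.log(determineEncoding("utf-16le", "text/plain;charset=utf-8")); // label wins
console.log(determineEncoding(undefined, "text/plain;charset=utf-8"));  // charset parameter
console.log(determineEncoding("no-such-encoding", "text/plain"));       // fallback
```

The sketch stops at choosing an encoding; the decode step itself is not modelled.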
6.5. Events
The FileReader object must be the event target for all events in this specification.
When this specification says to fire a progress event called e (for some ProgressEvent e at a given FileReader reader as the context object), the following are normative:
- The progress event e does not bubble. e.bubbles must be false [DOM].
- The progress event e is NOT cancelable. e.cancelable must be false [DOM].
6.5.1. Event Summary
The following are the events that are fired at FileReader objects.
Event name | Interface | Fired when… |
---|---|---|
loadstart | ProgressEvent | When the read starts. |
progress | ProgressEvent | While reading (and decoding) blob. |
abort | ProgressEvent | When the read has been aborted. For instance, by invoking the abort() method. |
error | ProgressEvent | When the read has failed (see file read errors). |
load | ProgressEvent | When the read has successfully completed. |
loadend | ProgressEvent | When the request has completed (either in success or failure). |
6.5.2. Summary of Event Invariants
This section is informative.
The following are invariants applicable to event firing for a given asynchronous read method in this specification:
- Once a loadstart has been fired, a corresponding loadend fires at completion of the read, UNLESS any of the following are true:
  - the read method has been cancelled using abort() and a new read method has been invoked
  - the event handler function for a load event initiates a new read
  - the event handler function for an error event initiates a new read.
  Note: The events loadstart and loadend are not coupled in a one-to-one manner.
  This example showcases "read-chaining": initiating another read from within an event handler while the "first" read continues processing.
    // In code of the sort...
    reader.onload = function() { reader.readAsText(alternateFile); };
    reader.readAsText(file);
    // ... the loadend event must not fire for the first read

    // The abort handler is registered before abort() is called,
    // because the abort event is fired during the call.
    reader.onabort = function() { reader.readAsText(updatedFile); };
    reader.readAsText(file);
    reader.abort();
    // ... the loadend event must not fire for the first read
- One progress event will fire when blob has been completely read into memory.
- No progress event fires after any one of abort, load, and error have fired. At most one of abort, load, and error fire for a given read.
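The last two invariants can be expressed as a check over a recorded list of event names for a single read. The helper below is a hypothetical testing aid (the name checkReadEvents is an assumption), not something this specification defines; it does not cover the loadstart and loadend pairing rules.

```javascript
// Hypothetical checker: given the event names observed for one read, in
// order, report whether the abort/load/error invariants hold.
function checkReadEvents(events) {
  const terminal = ["abort", "load", "error"];
  const isTerminal = (name) => terminal.includes(name);

  // At most one of abort, load, and error fires for a given read.
  if (events.filter(isTerminal).length > 1) {
    return false;
  }
  // No progress event fires after abort, load, or error has fired.
  const firstTerminal = events.findIndex(isTerminal);
  if (firstTerminal !== -1 && events.slice(firstTerminal + 1).includes("progress")) {
    return false;
  }
  return true;
}

console.log(checkReadEvents(["loadstart", "progress", "load", "loadend"])); // true
console.log(checkReadEvents(["loadstart", "load", "progress"]));            // false
```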
6.6. Reading on Threads
Web Workers allow for the use of synchronous File or Blob read APIs, since such reads on threads do not block the main thread. This section defines a synchronous API, which can be used within Workers [Web Workers]. Workers can use both the asynchronous API (the FileReader object) and the synchronous API (the FileReaderSync object).
6.6.1. The FileReaderSync API
This interface provides methods to synchronously read File or Blob objects into memory.
[Constructor, Exposed=(DedicatedWorker,SharedWorker)]
interface FileReaderSync {
  // Synchronously return strings
  ArrayBuffer readAsArrayBuffer(Blob blob);
  DOMString readAsBinaryString(Blob blob);
  DOMString readAsText(Blob blob, optional DOMString label);
  DOMString readAsDataURL(Blob blob);
};
6.6.1.1. Constructors
When the FileReaderSync() constructor is invoked, the user agent must return a new FileReaderSync object.
In environments where the global object is represented by a WorkerGlobalScope object, the FileReaderSync constructor must be available.
6.6.1.2. The readAsText() method
When the readAsText(blob, label) method is called, the following steps must be followed:
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.
- If no error has occurred, return the result of the read operation represented as a string in a format determined through the encoding determination algorithm.
6.6.1.3. The readAsDataURL() method
When the readAsDataURL(blob) method is called, the following steps must be followed:
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.
- If no error has occurred, return the result of the read operation as a Data URL [RFC2397] subject to the considerations below:
  - Use the blob’s type attribute as part of the Data URL if it is available, in keeping with the Data URL specification [RFC2397].
  - If the type attribute is not available on the blob, return a Data URL without a media-type [RFC2397]. Data URLs that do not have media-types [RFC2046] must be treated as plain text by conforming user agents [RFC2397].
6.6.1.4. The readAsArrayBuffer() method
When the readAsArrayBuffer(blob) method is called, the following steps must be followed:
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.
- If no error has occurred, return the result of the read operation as an ArrayBuffer.
6.6.1.5. The readAsBinaryString() method
When the readAsBinaryString(blob) method is called, the following steps must be followed:
- If readyState = LOADING, throw an InvalidStateError exception and terminate this algorithm.
- Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.
- If no error has occurred, return the result of the read operation as a binary string.
readAsArrayBuffer() is preferred over readAsBinaryString(), which is provided for backwards compatibility.
7. Errors and Exceptions
File read errors can occur when reading files from the underlying filesystem. The list below of potential error conditions is informative.
- The File or Blob being accessed may not exist at the time one of the asynchronous read methods or synchronous read methods is called. This may be due to it having been moved or deleted after a reference to it was acquired (e.g. concurrent modification with another application). See NotFoundError.
- A File or Blob may be unreadable. This may be due to permission problems that occur after a reference to a File or Blob has been acquired (e.g. concurrent lock with another application). Additionally, the snapshot state may have changed. See NotReadableError.
- User agents MAY determine that some files are unsafe for use within Web applications. A file may change on disk since the original file selection, thus resulting in an invalid read. Additionally, some file and directory structures may be considered restricted by the underlying filesystem; attempts to read from them may be considered a security violation. See §9 Security and Privacy Considerations and SecurityError.
7.1. Throwing an Exception or Returning an Error
This section is normative.
Error conditions can arise when reading a File or a Blob. The read operation can terminate due to error conditions when reading a File or a Blob; the particular error condition that causes a read operation to return failure or queue a task to process read error is called a failure reason.
Synchronous read methods throw exceptions of the type in the table below if there has been an error owing to a particular failure reason.
Asynchronous read methods use the error attribute of the FileReader object, which must return a DOMException object of the most appropriate type from the table below if there has been an error owing to a particular failure reason, or otherwise return null.
Type | Description and Failure Reason |
---|---|
NotFoundError
|
If the File or Blob resource could not be found at the time the read was processed,
this is the NotFound failure reason.
For asynchronous read methods the |
SecurityError
|
If:
For asynchronous read methods the This is a security error to be used in situations not covered by any other failure reason. |
NotReadableError
|
If:
For asynchronous read methods the |
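As an illustration of how script might branch on the exception types in the table, the sketch below maps an error's name to a user-facing message. The function name describeReadError and the message wording are hypothetical; in an asynchronous read the argument would typically be the reader's error attribute, and in a synchronous read it would be the caught exception.

```javascript
// Hypothetical helper: turn a read error (or null) into a short message.
function describeReadError(error) {
  if (error === null) {
    return "No error.";
  }
  switch (error.name) {
    case "NotFoundError":
      return "The file could not be found at the time the read was processed.";
    case "SecurityError":
      return "The read was refused for security reasons.";
    case "NotReadableError":
      return "The file could not be read.";
    default:
      return "Unexpected error: " + error.name;
  }
}

// Plain objects stand in for DOMException instances in this sketch.
console.log(describeReadError({ name: "NotReadableError" }));
console.log(describeReadError(null));
```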
8. A URL for Blob and File reference
This section defines a scheme for a URL used to refer to Blob objects (and File objects).
Note: other specifications, such as [MEDIA-SOURCE], extend this scheme to also refer to other types of objects.
8.1. Introduction
This section is informative.
Blob (or object) URLs are URLs like blob:http://example.com/550e8400-e29b-41d4-a716-446655440000. This enables integration of Blobs and Files with other Web Platform APIs that are only designed to be used with URLs, such as the img element. Blob URLs can also be navigated to, and can be used to trigger downloads of locally generated data.
For this purpose two static methods are exposed on the URL interface, createObjectURL(blob) and revokeObjectURL(url). The first method creates a mapping from a URL to a Blob, and the second method revokes said mapping. As long as the mapping exists the Blob can’t be garbage collected, so some care must be taken to revoke the URL as soon as the reference is no longer needed. All URLs are revoked when the global that created the URL itself goes away.
8.2. Model
Each user agent must maintain a blob URL store. A blob URL store is a map where keys are valid URL strings and values are blob URL entries.
A blob URL entry consists of an object (typically a Blob, but other specs can extend this to refer to other types of objects), and an environment (an environment settings object).
Keys in the blob URL store (also known as blob URLs) are valid URL strings that when parsed result in a URL with a scheme equal to "blob", an empty host, and a path consisting of one element itself also a valid URL string.
To generate a new blob URL, run the following steps:
- Let result be the empty string.
- Append the string "blob:" to result.
- Let settings be the current settings object.
- Let origin be settings’s origin.
- Let serialized be the ASCII serialization of origin.
- If serialized is "null", set it to an implementation-defined value.
- Append serialized to result.
- Append U+002F SOLIDUS (/) to result.
- Generate a UUID [RFC4122] as a string and append it to result.
- Return result.
An example of a blob URL that can be generated by this algorithm is blob:https://example.org/9115d58c-bcda-ff47-86e5-083e9a215304.
To add an entry to the blob URL store for a given object, run the following steps:
- Let store be the user agent’s blob URL store.
- Let url be the result of generating a new blob URL.
- Let entry be a new blob URL entry consisting of object and the current settings object.
- Set store[url] to entry.
- Return url.
To remove an entry from the blob URL store for a given url, run the following steps:
- Let store be the user agent’s blob URL store.
- Let url string be the result of serializing url.
- Remove store[url string].
8.2.1. Dereferencing Model for blob URLs
To resolve a blob URL given a url, run the following steps:
- Let store be the user agent’s blob URL store.
- Let url string be the result of serializing url with the exclude fragment flag set.
- If store[url string] exists, return store[url string]; otherwise return failure.
Further requirements for the parsing and fetching model for blob URLs are defined in the [URL] and [Fetch] specifications.
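The store and the steps that generate, add, remove, and resolve entries can be modelled in script as a plain map. This is only an illustrative model: real stores are internal to the user agent, the names blobURLStore, placeholderUUID, generateBlobURL, addEntry, removeEntry and resolveBlobURL are hypothetical, the environment is reduced to an object with a serialized origin, and the UUID generator is a stand-in that produces UUID-shaped random hex rather than a conforming [RFC4122] UUID.

```javascript
// Illustrative in-memory model of a blob URL store.
const blobURLStore = new Map();

// Stand-in for a real UUID generator: UUID-shaped random hex only.
function placeholderUUID() {
  const hex = (length) =>
    Array.from({ length }, () => Math.floor(Math.random() * 16).toString(16)).join("");
  return [hex(8), hex(4), hex(4), hex(4), hex(12)].join("-");
}

// "blob:" + serialized origin + "/" + UUID.
function generateBlobURL(serializedOrigin) {
  return "blob:" + serializedOrigin + "/" + placeholderUUID();
}

// Add an entry consisting of an object and its environment; return the URL.
function addEntry(object, environment) {
  const url = generateBlobURL(environment.origin);
  blobURLStore.set(url, { object, environment });
  return url;
}

// Remove the entry for a URL, if any.
function removeEntry(url) {
  blobURLStore.delete(url);
}

// Resolve a URL, ignoring any fragment; "failure" if nothing is registered.
function resolveBlobURL(url) {
  const withoutFragment = url.split("#")[0];
  return blobURLStore.has(withoutFragment) ? blobURLStore.get(withoutFragment) : "failure";
}

const exampleEnvironment = { origin: "https://example.org" };
const exampleURL = addEntry({ note: "stand-in for a Blob" }, exampleEnvironment);
console.log(resolveBlobURL(exampleURL + "#fragment") !== "failure"); // true
removeEntry(exampleURL);
console.log(resolveBlobURL(exampleURL)); // failure
```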
8.2.2. Origin of blob URLs
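To resolve the origin of a blob URL url, run the following steps:
- Let entry be the result of resolving url.
- If entry is not failure, return entry’s environment's origin.
- Let nested url be the result of parsing url’s path[0].
- Return a new opaque origin, if nested url is failure, and nested url’s origin otherwise.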
Note: The effect of this algorithm is that the origin of a blob URL is always the same as that of the environment that created the URL, as long as the URL hasn’t been revoked yet. If the URL was revoked the serialization of the origin will still remain the same as the serialization of the origin of the environment that created the blob URL, but for opaque origins the origin itself might be distinct. This difference isn’t observable though, since a revoked blob URL can’t be resolved/fetched anymore anyway.
The [URL] spec should be updated to refer to this algorithm to resolve the origin of a blob URL when the URL is first parsed. This is tracked in issue #63 and in whatwg/url#127.
8.2.3. Lifetime of blob URLs
This specification extends the unloading document cleanup steps with the following steps:
- Let environment be the Document's relevant settings object.
- Let store be the user agent’s blob URL store.
- Remove from store any entries for which the value's environment is equal to environment.
This needs a similar hook when a worker is unloaded.
8.3. Creating and Revoking a blob URL
Blob URLs are created and revoked using static methods exposed on the URL object. Revocation of a blob URL decouples the blob URL from the resource it refers to, and if it is dereferenced after it is revoked, user agents must act as if a network error has occurred. This section describes a supplemental interface to the URL specification [URL] and presents methods for blob URL creation and revocation.
[Exposed=(Window,DedicatedWorker,SharedWorker)]
partial interface URL {
static DOMString createObjectURL(Blob blob);
static void revokeObjectURL(DOMString url);
};
The createObjectURL(blob) static method must return the result of adding an entry to the blob URL store for blob.
The revokeObjectURL(url) static method must run these steps:
- Let url record be the result of parsing url.
- If url record’s scheme is not "blob", return.
- Let origin be the result of resolving the origin of url record.
- Let settings be the current settings object.
- If origin is not same origin with settings’s origin, return.
- Remove an entry from the blob URL store for url record.
Note: This means that rather than throwing some kind of error, attempting to revoke a URL that isn’t registered will silently fail. User agents might display a message on the error console if this happens.
Note: Attempts to dereference url after it has been revoked will result in a network error. Requests that were started before the url was revoked should still succeed.
In the example below, window1 and window2 are separate, but in the same origin; window2 could be an iframe inside window1.
    myurl = window1.URL.createObjectURL(myblob);
    window2.URL.revokeObjectURL(myurl);
Since a user agent has one global blob URL store, it is possible to revoke an object URL from a different window than from which it was created. The URL.revokeObjectURL() call ensures that subsequent dereferencing of myurl results in the user agent acting as if a network error has occurred.
8.3.1. Examples of blob URL Creation and Revocation
Blob URLs are strings that are used to fetch Blob objects, and can persist for as long as the document from which they were minted using URL.createObjectURL(); see §8.2.3 Lifetime of blob URLs.
This section gives sample usage of creation and revocation of blob URLs with explanations.
In the example below, two img elements [HTML] refer to the same blob URL:
    url = URL.createObjectURL(blob);
    img1.src = url;
    img2.src = url;
In the example below, URL.revokeObjectURL() is explicitly called.
    var blobURLref = URL.createObjectURL(file);
    var img1 = new Image();
    var img2 = new Image();
    // Both assignments below work as expected
    img1.src = blobURLref;
    img2.src = blobURLref;
    // ... Following body load
    // Check if both images have loaded
    if (img1.complete && img2.complete) {
      // Ensure that subsequent dereferencing results in a network error
      URL.revokeObjectURL(blobURLref);
    } else {
      msg("Images cannot be previewed!");
      // revoke the string-based reference
      URL.revokeObjectURL(blobURLref);
    }
The example above allows multiple references to a single blob URL, and the web developer then revokes the blob URL string after both image objects have been loaded. While not restricting the number of uses of the blob URL offers more flexibility, it increases the likelihood of leaks; developers should pair each call to URL.createObjectURL() with a corresponding call to URL.revokeObjectURL().
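One way to keep creation and revocation paired is to wrap both calls in a helper, so the URL is revoked even if the code using it throws. The sketch below is only one possible pattern; the name withObjectURL is hypothetical, and it suits only consumers that finish with the URL synchronously. For an img element, revocation would instead wait until the image has loaded, as in the example above.

```javascript
// Hypothetical helper: create a blob URL, pass it to a callback, and always
// revoke it afterwards.
function withObjectURL(blob, use) {
  const url = URL.createObjectURL(blob);
  try {
    return use(url);
  } finally {
    // Runs whether or not `use` threw, so the mapping does not leak.
    URL.revokeObjectURL(url);
  }
}

const sampleBlob = new Blob(["hello"], { type: "text/plain" });
const isBlobURL = withObjectURL(sampleBlob, (url) => url.startsWith("blob:"));
console.log(isBlobURL); // true
```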
9. Security and Privacy Considerations
This section is informative.
This specification allows web content to read files from the underlying file system, as well as provides a means for files to be accessed by unique identifiers, and as such is subject to some security considerations. This specification also assumes that the primary user interaction is with the <input type="file"/> element of HTML forms [HTML], and that all files that are being read by FileReader objects have first been selected by the user. Important security considerations include preventing malicious file selection attacks (selection looping), preventing access to system-sensitive files, and guarding against modifications of files on disk after a selection has taken place.
- Preventing selection looping. During file selection, a user may be bombarded with the file picker associated with <input type="file"/> (in a "must choose" loop that forces selection before the file picker is dismissed) and a user agent may prevent file access to any selections by making the FileList object returned be of size 0.
- System-sensitive files (e.g. files in /usr/bin, password files, and other native operating system executables) typically should not be exposed to web content, and should not be accessed via blob URLs. User agents may throw a SecurityError exception for synchronous read methods, or return a SecurityError exception for asynchronous reads.
This section is provisional; more security data may supplement this in subsequent drafts.
10. Requirements and Use Cases
This section covers what the requirements are for this API, as well as illustrates some use cases. This version of the API does not satisfy all use cases; subsequent versions may elect to address these.
- Once a user has given permission, user agents should provide the ability to read and parse data directly from a local file programmatically.
- Data should be able to be stored locally so that it is available for later use, which is useful for offline data access for web applications.
  A Calendar App. User’s company has a calendar. User wants to sync local events to company calendar, marked as "busy" slots (without leaking personal info). User browses for file and selects it. The text/calendar file is parsed in the browser, allowing the user to merge the files to one calendar view. The user wants to then save the file back to their local calendar file (using "Save As"?). The user can also send the integrated calendar file back to the server calendar store asynchronously.
- User agents should provide the ability to save a local file programmatically given an amount of data and a file name.
  Note: While this specification doesn’t provide an explicit API call to trigger downloads, the HTML5 specification has addressed this. The download attribute of the a element initiates a download, saving a File with the name specified. The combination of this API and the download attribute on a elements allows for the creation of files within web applications, and the ability to save them locally.
  A Spreadsheet App. User interacts with a form, and generates some input. The form then generates a CSV (Comma Separated Values) output for the user to import into a spreadsheet, and uses "Save...". The generated output can also be directly integrated into a web-based spreadsheet, and uploaded asynchronously.
- User agents should provide a streamlined programmatic ability to send data from a file to a remote server that works more efficiently than form-based uploads today.
- User agents should provide an API exposed to script that exposes the features above. The user is notified by UI anytime interaction with the file system takes place, giving the user full ability to cancel or abort the transaction. The user is notified of any file selections, and can cancel these. No invocations to these APIs occur silently without user intervention.
Acknowledgements
This specification was originally developed by the SVG Working Group. Many thanks to Mark Baker and Anne van Kesteren for their feedback.
Thanks to Robin Berjon and Jonas Sicking for editing the original specification.
Special thanks to Olli Pettay, Nikunj Mehta, Garrett Smith, Aaron Boodman, Michael Nordman, Jian Li, Dmitry Titov, Ian Hickson, Darin Fisher, Sam Weinig, Adrian Bateman and Julian Reschke.
Thanks to the W3C WebApps WG, and to participants on the public-webapps@w3.org listserv.