- published: 20 Sep 2013
- views: 508590
UTF-8 is a character encoding capable of encoding all possible characters, or code points, in Unicode.
The encoding is variable-length and uses 8-bit code units. It was designed for backward compatibility with ASCII, and to avoid the complications of endianness and byte order marks in the alternative UTF-16 and UTF-32 encodings. The name is derived from: Universal Coded Character Set + Transformation Format – 8-bit.
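As a rough illustration of the byte-order point, the sketch below encodes a single ASCII letter with Python's built-in codecs. The variable names are chosen here for illustration only. UTF-8 gives the same single byte as ASCII, while UTF-16 produces two different byte sequences depending on endianness and needs a byte order mark to tell them apart.

```python
# Encode one ASCII letter in several Unicode encodings using built-in codecs.
text = "A"

utf8 = text.encode("utf-8")          # one byte, identical to the ASCII encoding
utf16_le = text.encode("utf-16-le")  # two bytes, least significant byte first
utf16_be = text.encode("utf-16-be")  # two bytes, most significant byte first
utf16_bom = text.encode("utf-16")    # byte order mark followed by the platform's byte order

print(utf8.hex(), utf16_le.hex(), utf16_be.hex(), utf16_bom.hex())
```

UTF-8 has only one possible byte sequence for the letter, so no byte order mark is needed to interpret it.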
UTF-8 is the dominant character encoding for the World Wide Web, accounting for 86.2% of all Web pages in January 2016 (the most popular East Asian encodings, GB 2312 and Shift JIS, accounted for 0.9% and 1.1% respectively). The Internet Mail Consortium (IMC) recommends that all e-mail programs be able to display and create mail using UTF-8, and the W3C recommends UTF-8 as the default encoding in XML and HTML.
UTF-8 encodes each of the 1,112,064 valid code points in the Unicode code space (1,114,112 code points minus 2,048 surrogate code points) using one to four 8-bit bytes (a group of 8 bits is known as an octet in the Unicode Standard). Code points with lower numerical values (i.e., earlier code positions in the Unicode character set, which tend to occur more frequently) are encoded using fewer bytes. The first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single octet with the same binary value as ASCII, so valid ASCII text is also valid UTF-8-encoded Unicode. Conversely, ASCII bytes never occur in the encoding of non-ASCII code points, which makes UTF-8 safe to use within most programming and document languages that interpret certain ASCII characters in a special way, e.g. as end of string.
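The one-to-four-byte scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not a production encoder: the function name `encode_utf8` is hypothetical, and in practice Python's built-in `str.encode("utf-8")` does the same job.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative sketch)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    if code_point < 0x80:
        # 1 byte: 0xxxxxxx (same value as ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Compare against the built-in codec for one code point from each length class.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    assert encode_utf8(cp) == chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {encode_utf8(cp).hex(' ')}")
```

Every byte of a multi-byte sequence has its high bit set, which is why ASCII byte values never appear inside the encoding of a non-ASCII code point.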
Audible free book: http://www.audible.com/computerphile Representing symbols, characters and letters that are used worldwide is no mean feat, but Unicode managed it - how? Tom Scott explains how the web has settled on a standard. More from Tom Scott: http://www.youtube.com/user/enyay and https://twitter.com/tomscott EXTRA BITS: http://youtu.be/qBex3IDaUbU Data Security: http://youtu.be/4SSSMi4X_mA http://www.facebook.com/computerphile https://twitter.com/computer_phile This video was filmed and edited by Sean Riley. Computerphile is a sister project to Brady Haran's Numberphile. See the full list of Brady's video projects at: http://bit.ly/bradychannels
This video gives an introduction to UTF-8 and Unicode. It gives a detailed description of UTF-8 and how to encode in UTF-8. This is a video presentation of the article "How about Unicode and UTF-8", which was published on www.gamedev.net. Writing an STL-Style UTF-8 String Class - http://squaredprogramming.blogspot.com/2013/12/writing-stl-style-utf-8-string-class.html How about Unicode and UTF-8 - http://www.gamedev.net/page/resources/_/technical/general-programming/how-about-unicode-and-utf-8-r3322 www.squaredprogramming.com
This tutorial explains how UTF-8 represents characters in a computer, then generalizes, at a high level, to how any kind of data can be represented in a computer.
UTF8 is fantastic, but people still have translation issues with some characters - Tom explains why.
UTF-8 and UTF-16 are different encodings for the Unicode character set. Let's discuss UTF-8 first. UTF-8 is what is known as a variable-length encoding. This means that the amount of storage a character takes up depends on which character it is. For example, if we store the character A, it will only take up one byte. In fact, ASCII is a subset of UTF-8, which means that ASCII data is already valid UTF-8. If you are new to computer storage, a byte is a very small amount of information. The smallest thing a computer can store is a bit: 1 or 0, on or off. There are 8 bits in a byte, 1024 bytes in a kilobyte, 1024 kilobytes in a megabyte, 1024 megabytes in a gigabyte, 1024 gigabytes in a terabyte, and 1024 terabytes in a petabyte. Considering it is completely possible fo...
Learning LaTeX - Lesson 03 - Arial 12 Font and UTF8. In this third lesson we will change the document's font and font size using very simple commands. I will also show you how to avoid mistakes and keep the accented characters in your documents correct! Official channel of the Higher Technology Course in Systems Analysis and Development at FAETERJ-Rio.
She wiped the smile right off my face
And hid it away in a secret place
The night was dark and the ground was cold
I slipped myself into a pool
I saw the trees but not the wood
And floated in an icy flood
As cold began to freeze my heart
I heard a voice come through the dark
Bring up the coals
Light up the fire
Joy de viva
Joy de viva
Smile your shining smile on me
If you see her
Say I need her
Joy de viva
Joy de viva
Now sunburned men tell tales of me
Of how I sail the ocean deep
Upon the brow I shade my face
Searching for that state of grace
Every night the moon appears
She shows me that I need not fear
The crashing rocks and siren wind
And I will find her in the end
Then I will run