"युनिकोड" का संशोधनहरू बिचको अन्तर

m robot Adding: mr:युनिकोड
m robot Modifying: ckb:یوونیکۆد; cosmetic changes
Line 24:
The [[Unicode Consortium]], based in [[California]], is the organization that develops the Unicode standard. Membership is open to any company or individual willing to pay the membership dues. Members include virtually all of the main computer software and hardware companies with any interest in text processing standards, such as [[Apple Computer]], [[Microsoft]], [[International Business Machines|IBM]], [[Xerox]], [[Hewlett-Packard|HP]], [[Adobe Systems]] and many others.
 
The Consortium first published ''The Unicode Standard'' (ISBN 0-321-18578-1) in [[1991]], and continues to develop standards based on that original work. Unicode was developed in conjunction with the [[ISO|International Organization for Standardization]] and it shares its character repertoire with [[ISO/IEC 10646]]. Unicode and ISO/IEC 10646 are equivalent as character encodings, but ''The Unicode Standard'' contains much more information for implementers: it covers, in depth, topics such as bitwise encoding, [[collation]], and rendering, and it enumerates a multitude of character properties, including those needed for [[BiDi]] support. The two standards also use slightly different terminology.
 
==== Unicode revision history ====
Line 46:
The mapping methods are called the UTF (Unicode Transformation Format) and UCS (Universal Character Set) encodings. Among them are [[UTF-32]], [[UCS-4]], [[UTF-16]], [[UCS-2]], [[UTF-8]], [[UTF-EBCDIC]] and [[UTF-7]]. The numbers indicate the number of bits in one unit for UTF encodings, or the number of bytes for UCS encodings. In UTF-32 or UCS-4, one unit is enough for any character; in the other cases, a variable number of units is used for each character. UTF-8 is the de facto standard encoding for interchange of Unicode text, with UTF-16 and UTF-32 being used mainly for internal processing.
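
The difference between fixed and variable numbers of units can be checked directly. The following is a minimal Python sketch, not part of the standard; the helper name <code>code_units</code> and the sample characters are chosen here purely for illustration.

```python
def code_units(ch: str) -> dict:
    """Return how many code units one character needs in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit units
    }

# One ASCII letter, one Greek letter, one CJK ideograph, and one
# character outside the Basic Multilingual Plane (U+1D11E).
for ch in ("A", "\u03c0", "\u8449", "\U0001D11E"):
    print(f"U+{ord(ch):04X}", code_units(ch))
```

The explicit <code>-be</code> codec variants are used so that Python does not prepend a byte order mark, which would otherwise inflate the counts.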
 
The Unicode [[Byte Order Mark|byte order mark]] (BOM) is specified for use at the beginning of text files in the UCS-2 and UTF-16 encodings. Some software developers have adopted it for other encodings, including UTF-8, which does not need an indication of byte order; in that case it serves only to mark the file as containing Unicode text. The BOM is code point <code>U+FEFF</code>, which has the important property of being unambiguously interpretable regardless of which Unicode encoding is used. The units <code>FE</code> and <code>FF</code> never appear in [[UTF-8]], <code>U+FFFE</code> (the result of byte-swapping <code>U+FEFF</code>) is not a legal character, and <code>U+FEFF</code> is the Zero-Width No-Break Space (a character with no appearance and no effect other than preventing the formation of [[ligature (typography)|ligatures]]). The same character converted to UTF-8 becomes the byte sequence <code>EF BB BF</code>.
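
As an illustration of how the BOM identifies an encoding, here is a small Python sketch. The function name <code>sniff_bom</code> is hypothetical, and the sketch covers only the three encodings discussed above; it makes no attempt to recognise UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 one.

```python
import codecs
from typing import Optional

def sniff_bom(data: bytes) -> Optional[str]:
    """Guess an encoding from a leading byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE (byte-swapped)
        return "utf-16-le"
    return None                               # no BOM: encoding unknown

# U+FEFF encoded three ways, each followed by the same short text.
for codec in ("utf-8", "utf-16-be", "utf-16-le"):
    sample = ("\ufeff" + "hi").encode(codec)
    print(codec, sample.hex(" "), sniff_bom(sample))
```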
 
See also: [[Mapping of Unicode characters]]
Line 60:
Combining marks, like the complex script shaping required to properly render [[Arabic]] text and many other scripts, usually depend on complex font technologies such as [[OpenType]] (by Adobe and [[Microsoft]]), Graphite (by [[SIL International]]), and [[Apple Advanced Typography|AAT]] (by [[Apple Computer|Apple]]), through which a font designer includes instructions in a font telling software how to properly output different character sequences. Another method sometimes employed in [[fixed-width]] fonts is to place the combining mark's glyph before its own left [[sidebearing]]; this method, however, works only for some diacritics, and stacking will not occur properly.
 
[[As of 2004]], most software still cannot reliably handle many features not supported by older font formats, so combining characters generally will not work correctly. In theory, {{युनिकोड|&#7703;}} (precomposed e with macron and acute above) and {{युनिकोड|e&#772;&#769;}} (e followed by the combining macron above and combining acute above) are identical in appearance, both giving an [[e]] with [[macron]] and [[acute accent]], but in practice their appearance can vary greatly across software applications.
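
The two spellings differ as code point sequences even though they are meant to look the same. The following Python sketch uses the standard <code>unicodedata</code> module to show that normalization maps one form onto the other; the helper name <code>canonically_equal</code> is an assumption made for this example, and the sketch says nothing about how any particular font or renderer will draw either form.

```python
import unicodedata

precomposed = "\u1e17"        # &#7703;: e with macron and acute
decomposed = "e\u0304\u0301"  # e + &#772; (macron) + &#769; (acute)

def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(len(precomposed), len(decomposed))           # code point counts differ
print(precomposed == decomposed)                   # raw comparison
print(canonically_equal(precomposed, decomposed))  # comparison after NFC
```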
 
Underdots, as needed in Indic [[Romanization]], are also often misplaced or fail to render at all. Sample:
:{{युनिकोड|m&#803; - n&#803; - l&#803;}}
This is not a weakness in Unicode itself; it only exposes gaps in rendering technology and fonts.
 
Line 90:
Recent web browsers display web pages using Unicode if an appropriate [[typeface|font]] is installed (see [[Unicode and HTML]]).
 
Although syntax rules may affect the order in which characters are allowed to appear, both HTML 4.0 and XML 1.0 documents are, by definition, composed of characters from the entire range of Unicode code points, minus only a handful of disallowed control characters and the permanently unassigned code points D800-DFFF, any code point ending in FFFE or FFFF, and any code point above 10FFFF. These characters appear either directly as [[byte]]s according to the document's encoding, if the encoding supports them, or as numeric character references based on the character's Unicode code point, as long as the document's encoding supports the digits and symbols required to write the references (all encodings approved for use on the Internet do). For example, the references <code>&amp;#916;</code> <code>&amp;#1049;</code> <code>&amp;#1511;</code> <code>&amp;#1605;</code> <code>&amp;#3671;</code> <code>&amp;#12354;</code> <code>&amp;#21494;</code> <code>&amp;#33865;</code> <code>&amp;#45307;</code> (or the same numeric values expressed in hexadecimal, with <code>&amp;#x</code> as the prefix) display in your browser as &#916;, &#1049;, &#1511;, &#1605;, &#3671;, &#12354;, &#21494;, &#33865; and &#45307;. If you have the proper fonts, these symbols look like the [[Greek alphabet|Greek]] capital letter "Delta", [[Cyrillic alphabet|Cyrillic]] capital letter "Short I", [[Hebrew alphabet|Hebrew]] letter "Qof", [[Arabic alphabet|Arabic]] letter "Meem", [[Thai language|Thai]] [[numeral]] 7, [[Japanese language|Japanese]] [[Hiragana]] "A", [[simplified Chinese]] "[[Leaf]]", [[traditional Chinese]] "Leaf", and [[Korean language|Korean]] [[Hangul]] syllable "Nyaelh", respectively.
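
The correspondence between numeric character references and code points can be reproduced with a short Python sketch. It relies on the standard library's <code>html.unescape</code>; the helper <code>to_ncr</code> is a hypothetical name introduced here, and it escapes every character rather than only those the document encoding cannot represent.

```python
import html

# The decimal references listed in the paragraph above.
refs = "&#916; &#1049; &#1511; &#1605; &#3671; &#12354; &#21494; &#33865; &#45307;"
decoded = html.unescape(refs)

def to_ncr(text: str, hexadecimal: bool = False) -> str:
    """Write every character of text as a numeric character reference."""
    fmt = "&#x{:X};" if hexadecimal else "&#{};"
    return "".join(fmt.format(ord(ch)) for ch in text)

for ch in decoded.split():
    # Show the code point next to its decimal and hexadecimal reference.
    print(f"U+{ord(ch):04X}", to_ncr(ch), to_ncr(ch, hexadecimal=True))
```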
 
=== Fonts ===
Line 101:
 
=== Multilingual Text Rendering Engines ===
* [[Uniscribe]] - [[Microsoft Windows|Windows]]
* [[Apple Type Services for Unicode Imaging]] - new engine for [[Apple Macintosh|Macintosh]]
* [[WorldScript]] - old engine for [[Apple Macintosh|Macintosh]]
* [[Pango]] - [[open source]]
* [[Graphite (Renderer)|Graphite]] - (open source renderer from [[SIL International|SIL]])
 
=== Input methods ===
On [[Windows XP]], any Unicode character can be input by holding down Alt and, using only the numeric keypad keys, typing the [[decimal]] digits of the character's Unicode code point one after the other. For example, holding Alt and pressing 9, then 6, then 0 yields π (Greek lowercase letter Pi). For values less than 256, precede the digits with a 0 to avoid code page translation (see [[Extended ASCII]]); for example, Alt 0, 1, 6, 5 yields ¥.
 
Word 2003 also allows Unicode characters to be entered by typing the hexadecimal code first, e.g. 014B for the 'ng' symbol, and then pressing Alt+X to replace the string to the left of the cursor with its Unicode character.
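
Both methods come down to naming a code point, in decimal for the Alt keypad method and in hexadecimal for the Alt+X method. The following Python sketch shows the same lookups; the function names are invented for this example, and the sketch does not emulate the code page translation that Windows applies to values below 256 typed without a leading 0.

```python
def from_decimal(digits: str) -> str:
    """Character for decimal digits, as typed on the numeric keypad."""
    return chr(int(digits, 10))

def from_hex(digits: str) -> str:
    """Character for a hexadecimal code, as typed before Alt+X."""
    return chr(int(digits, 16))

print(from_decimal("960"))   # Greek lowercase letter Pi
print(from_decimal("0165"))  # yen sign; the leading zero is harmless here
print(from_hex("014B"))      # the 'ng' symbol
```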
Line 130:
* [http://www.eki.ee/letter/ The Letter Database] Uses forms to present groups in list or grid format by [[hexadecimal]].
* [http://www.decodeunicode.org/ DecodeUnicode - Unicode Wiki, 50.000 gifs and information about each character]
* [http://www.cl.cam.ac.uk/~mgk25/ucs/examples/ Example text files using Unicode]
* [http://www.lazytools.com/unicode-ascii/ Unicode special character map] is similar to the Windows version. Click a symbol to obtain either the named or numeric code for HTML.
* [[Michael Everson]]'s [http://www.unicode.org/notes/tn4/everson-iuc21pap.pdf "Leaks in the Unicode pipeline: script, script, script…"] PDF 2MB
* [http://www.evertype.com/standards/csur/ ConScript Unicode Registry] a project to standardize part of the Private Use Area for use with [[artificial script]]s and artificial languages. An explanation of how to propose character names in Unicode is available here.
* [http://www-106.ibm.com/developerworks/unicode/library/u-secret.html The secret life of Unicode] "A peek at Unicode's soft underbelly" Describes problems requiring resolution. Includes links to Unicode resources.
* Tim Bray's [http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF Characters vs Bytes] explains how the different encodings work.
* [http://www.joelonsoftware.com/articles/Unicode.html The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)] by [[Joel Spolsky]]
* [http://www.alanwood.net/unicode/ Alan Wood's Unicode Resources] Contains lists of word processors with Unicode capability; characters are grouped by type; characters are presented in lists, not grids.
* [[Font]]s and tools:
** [http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html Unicode fonts and tools] for the [[X Window System]]
** Unicode TTF fonts: [[Arial Unicode MS]], [[Code2000]]: [http://home.att.net/~jameskass/ license info and download link], [[Junicode]]: [http://www.engl.virginia.edu/OE/junicode/junicode.html license info and download link], [[Titus Cyberbit Basic]]: [http://titus.uni-frankfurt.de/indexe.htm?/unicode/unitest2.htm license info] & [http://titus.fkidg1.uni-frankfurt.de/unicode/tituut.asp download link]
** [http://earthlingsoft.net/UnicodeChecker/ UnicodeChecker], a Unicode character browser for [[Mac OS X]]
* [[Software engineering]]:
** [http://www.icu-project.org/ International Components for Unicode (ICU)] An open source set of libraries that provide robust and full-featured Unicode services for your applications on a wide variety of platforms.
** [http://www.joelonsoftware.com/articles/Unicode.html The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)] by [[Joel Spolsky]] of JoelonSoftware.com
Line 150:
{{SpecialChars}}
 
[[Category:कम्प्युटर]]
[[Category:सफ्ट्वेर]]
 
[[als:Unicode]]
Line 162:
[[ca:Unicode]]
[[chr:Unicode/Cherokee]]
[[ckb:یوونیکۆد]]
[[cs:Unicode]]
[[da:Unicode]]
"https://ne.wikipedia.org/wiki/युनिकोड" बाट अनुप्रेषित