Difference between revisions of "Unicode"
Line 1:
{{Table Unicode}}
'''Unicode''' is an international computing standard whose aim is to provide a means of [[सङ्केतन|encoding]] the script of every document that people want to store on computers. This includes all [[:en:Writing system|script]]s
The creation of Unicode is an ambitious [[:en:project|project]] to replace existing [[:en:character encoding|character set]]s, many of which are
== Origin and development ==
It is the explicit aim of Unicode to transcend the limitations of traditional [[:en:character encoding|character encoding]]s such as those defined by the [[:en:ISO 8859|ISO 8859]] standard, which are
Unicode's
This simple aim is greatly complicated by another aim, which is to provide lossless conversion amongst different existing
The Unicode standard also includes a number of related items, such as character properties, text normalisation forms, and bidirectional display order (for the correct display of text containing both right-to-left scripts, such as [[:en:Arabic|Arabic]] or [[:en:Hebrew language|Hebrew]], and left-to-right scripts).
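As a small illustration of the character properties mentioned above, the following Python sketch looks up the general category and bidirectional class of a few characters using the standard library's <code>unicodedata</code> module. The helper name <code>describe</code> and the sample characters are assumptions made for this example only.

```python
import unicodedata

def describe(ch):
    """Return (code point label, general category, bidirectional class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.bidirectional(ch))

# Sample characters chosen for illustration: a Latin letter (left-to-right),
# a Hebrew letter (right-to-left) and an Arabic letter (right-to-left, Arabic).
for ch in ["A", "\u05d0", "\u0627"]:
    print(describe(ch))
```

A renderer that implements bidirectional display uses classes like these to decide the visual order of mixed-direction text; the sketch only reads the properties and does not reorder anything.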
In [[1997]] a proposal was made by [[:en:Michael Everson|Michael Everson]] to encode the characters of the [[Klingon language]]
== Mapping and encodings ==
=== Standard ===
The [[युनिकोड Consortium|Unicode Consortium]],
The Consortium first published ''The Unicode Standard'' (ISBN 0-321-18578-1)
==== Unicode revision history ====
Line 39:
=== Storage, transfer and processing ===
So far, it has only been said that Unicode is a means to assign a unique number for all characters used by [[human]]
The internal logic of much 8-bit legacy software typically permits only 8 bits for each character, making it impossible to use more than 256 code points without special processing, and 16-bit software is limited to some tens of thousands of characters, while Unicode is already up to more than 90,000 encoded characters. Several mechanisms have therefore been suggested to implement Unicode; which one is chosen depends on available storage space, [[source code]] compatibility, and interoperability with other systems.
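To make the storage trade-off concrete, the following Python sketch reports how many bytes a character occupies in three common encoding forms. The function name <code>encoded_sizes</code> and the sample characters are assumptions made for this illustration; the big-endian codec variants are used only so that no byte order mark is counted.

```python
def encoded_sizes(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and UTF-32 (big-endian, no BOM)."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }

# Sample characters chosen for illustration: ASCII, Latin-1 range,
# Devanagari, and one character beyond the first 65,536 code points.
for ch in ["A", "\u00e9", "\u0915", "\U00010348"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Characters above U+FFFF need two 16-bit units in UTF-16, which is why software that assumes one 16-bit unit per character cannot represent them without special processing.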
The mapping methods are called the UTF (Unicode Transformation Format) and UCS (Universal Character Set) encodings. Among them are [[UTF-32]], [[UCS-4]], [[UTF-16]], [[UCS-2]], [[UTF-8]], [[UTF-EBCDIC]] and [[UTF-7]]. The numbers indicate the number of
The Unicode [[Byte Order Mark|byte order mark]] (BOM) is specified for use at the beginnings of text
{{See also|Mapping of Unicode characters}}
Line 51:
=== Ready-made vs. composite characters ===
Unicode includes a mechanism for modifying character shape and so greatly extending the supported glyph repertoire. This is the use of [[combining diacritical mark]]s. They are inserted after the main character (it is possible to stack several combining diacritics over the same character). However, for reasons of compatibility, Unicode also includes a large quantity of [[precomposed character]]s.
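The relationship between a combining sequence and its precomposed equivalent can be sketched in Python with the standard library's <code>unicodedata.normalize</code>. The helper names <code>to_precomposed</code> and <code>to_combining</code> are assumptions made for this example; NFC and NFD are the normalisation forms defined by the Unicode standard.

```python
import unicodedata

def to_precomposed(text):
    """Compose base characters and combining marks where a precomposed form exists (NFC)."""
    return unicodedata.normalize("NFC", text)

def to_combining(text):
    """Decompose precomposed characters into base character plus combining marks (NFD)."""
    return unicodedata.normalize("NFD", text)

# 'e' followed by COMBINING MACRON (U+0304) and COMBINING ACUTE ACCENT (U+0301)
sequence = "e\u0304\u0301"
# LATIN SMALL LETTER E WITH MACRON AND ACUTE
single = "\u1e17"

print(len(sequence), len(single))          # three code points versus one
print(to_precomposed(sequence) == single)  # the two spellings are canonically equivalent
print(to_combining(single) == sequence)
```

Comparing strings only after normalising both to the same form avoids treating the two spellings as different text.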
The situation with [[Hangul]] is similar. Unicode provides the mechanism for composing Hangul syllables with [[Hangul Jamo]]. However, the precomposed Hangul syllables (11,172 of them) are also provided.
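The figure of 11,172 follows from the arithmetic layout of the precomposed syllable block: 19 leading consonants × 21 vowels × 28 trailing positions (27 consonants plus "none"). The Python sketch below shows that arithmetic; the constant and function names are chosen for this example, and the base values are those of the Hangul Syllables and Hangul Jamo blocks rather than anything stated in this article.

```python
import unicodedata

S_BASE = 0xAC00  # first precomposed Hangul syllable
L_BASE = 0x1100  # first leading consonant jamo
V_BASE = 0x1161  # first vowel jamo
T_BASE = 0x11A7  # one before the first trailing consonant jamo
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28

def compose_hangul(l_index, v_index, t_index=0):
    """Return the precomposed syllable for zero-based jamo indices (t_index 0 = no trailing consonant)."""
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT and 0 <= t_index < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

print(L_COUNT * V_COUNT * T_COUNT)  # number of precomposed syllables

# Leading index 18, vowel index 0, trailing index 4 give U+D55C.
syllable = compose_hangul(18, 0, 4)
jamo_sequence = chr(L_BASE + 18) + chr(V_BASE + 0) + chr(T_BASE + 4)
print(f"U+{ord(syllable):04X}", unicodedata.normalize("NFC", jamo_sequence) == syllable)
```

The final line cross-checks the arithmetic against the standard library's normalisation, which composes the same jamo sequence into the same syllable.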
The [[CJK]] ideographs currently are encoded
Combining marks, like the complex script shaping required to properly render [[Arabic]] text and many other scripts, are usually dependent on complex font technologies, like [[OpenType]] (by Adobe and [[Microsoft]]), Graphite (by [[SIL International]]), and [[Apple Advanced Typography|AAT]] (by [[Apple Computer|Apple]]), by which a font designer includes
[[As of 2004]], most software still cannot reliably handle many features not supported by older font formats, so combining characters generally will not work correctly. Hypothetically, {{युनिकोड|ḗ}} (precomposed e with macron and acute above) and {{युनिकोड|ḗ}} (e followed by the combining macron above and combining acute above) are
Also underdots, as
:{{युनिकोड|ṃ - ṇ - ḷ}}
Of course, this
=== Issues ===
Some people,
Unicode is criticized for failing to allow for older and alternate forms of [[kanji]], which, it is said, complicates the processing of ancient Japanese and uncommon Japanese names, although it follows the recommendations of Japanese scholars of the language and of the Japanese government. There have been several attempts to create an alternative to Unicode. [http://www-106.ibm.com/developerworks/unicode/library/u-secret.html] Among them are [[TRON]] (although it is not widely
[[Thai language]] support has been criticized for its illogical ordering of Thai characters. This complication is due to Unicode inheriting the [[TIS-620|Thai Industrial Standard 620]], which
==
=== Operating systems ===
Line 80:
=== E-mail ===
[[MIME]] defines two different mechanisms for encoding non-ASCII
The adoption of
=== Web ===
Line 88:
Recent web browsers display web pages using Unicode if an appropriate [[typeface|font]] is installed (see [[Unicode and HTML]]).
Although syntax rules may affect the
=== Fonts ===
Line 94:
Free and retail fonts based on Unicode are common, since first [[TrueType]] and now [[OpenType]] support Unicode. These font formats map Unicode code points to glyphs.
There are thousands of fonts on the market, but fewer than a dozen fonts attempt to support the majority of Unicode's character repertoire; these fonts are sometimes described as pan-Unicode. Instead, Unicode-based fonts typically focus on supporting only basic ASCII and particular scripts or sets of characters or symbols. There are several reasons for this: applications and documents rarely need to render characters from more than one or two writing systems; fonts tend to be resource
Unicode characters which cannot be rendered are most often displayed as an open rectangle only, to indicate the position of the unrecognized character. Some attempts have been made to provide more information about these characters. The Apple ''[[LastResort]]'' font will display a substitute glyph indicating the Unicode range of the character and the [[SIL International|SIL]] [[Unicode fallback font]] will display a box showing the hexadecimal scalar value of the character.
Line 110:
Word 2003 also allows for entering Unicode characters by spelling out the code first, e.g. 014B for the 'ng'-symbol, and then hitting 'Alt' plus 'X' to replace the string to the left with its Unicode character.
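The conversion behind this kind of hexadecimal input can be sketched in Python as follows. This is not how Word implements the feature; the function name <code>hex_to_char</code> is an assumption made for this example, and the sketch only shows the mapping from typed hexadecimal digits to a character.

```python
import unicodedata

def hex_to_char(hex_digits):
    """Convert a string of hexadecimal digits, such as '014B', to the corresponding character."""
    code_point = int(hex_digits, 16)
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    return chr(code_point)

ch = hex_to_char("014B")
print(ch, unicodedata.name(ch))
```

Input that is not valid hexadecimal, or that lies beyond U+10FFFF, raises <code>ValueError</code> in this sketch.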
Macintosh users have a similar feature with an input method called 'Unicode Hex Input',
[[GNOME|Gnome2]] follows [[ISO 14755]]. Hold down Ctrl and Shift and enter the hexadecimal Unicode value.
The [[Opera (web browser)|Opera web browser]]
== See also ==
Line 125:
** [http://www.unicode.org/charts/ Code Charts] ([[portable document format|PDF]])
* [http://www.macchiato.com/unicode/charts.html UTF-8, UTF-16, UTF-32 Code Charts] and a [http://www-atm.physics.ox.ac.uk/user/iwi/charmap.html character map] ([[JavaScript]])
* [http://www.eki.ee/letter/ The Letter Database] Uses forms to present
* [http://www.decodeunicode.org/ DecodeUnicode - Unicode Wiki, 50.000 gifs and information about each character]
* [http://www.cl.cam.ac.uk/~mgk25/ucs/examples/ Example text files using Unicode]
* [http://www.lazytools.com/unicode-ascii/ Unicode special character map] is similar to the Windows version. Click a symbol to obtain either the named or numeric code for HTML.
* [[Michael Everson]]'s [http://www.unicode.org/notes/tn4/everson-iuc21pap.pdf "Leaks in the Unicode pipeline: script, script, script…"] PDF 2MB
* [http://www.evertype.com/standards/csur/ ConScript Unicode Registry] a project to standardize part of the Private Use Area for use with [[artificial script]]s and artificial languages. An explanation of how to propose character
* [http://www-106.ibm.com/developerworks/unicode/library/u-secret.html The secret life of Unicode] "A peek at Unicode's soft underbelly" Describes problems requiring resolution. Includes links to Unicode resources.
* Tim Bray's [http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF Characters vs Bytes] explains how the different encodings work.
* [http://www.joelonsoftware.com/articles/Unicode.html The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)] by [[Joel Spolsky]]
* [http://www.alanwood.net/unicode/ Alan Wood's Unicode Resources] Contains lists of word processors with Unicode capability; characters are grouped by type; characters are
* [[Font]]s and tools:
** [http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html Unicode fonts and tools] for the [[X Window System]]
Line 142:
** [http://www.icu-project.org/ International Components for Unicode (ICU)] An open source set of libraries that provide robust and full-featured Unicode services for your applications on a wide variety of platforms.
** [http://www.joelonsoftware.com/articles/Unicode.html The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)] by [[Joel Spolsky]] of JoelonSoftware.com
** [http://freedesktop.org/wiki/Software_2futf_2d8 Freedesktop.Org’s Project UTF-8]’s purpose is to document and promote proper Unicode
* Seeing [http://www.ianalbert.com/misc/unichart.php the entirety of Unicode printed out] as a single large poster gives a good feel for the size of the code.