devunicode

discussion now on this forum

links

For replacing TLabel, TEdit, TMemo see

    http://www.tntware.com/delphicontrols/unicode/downloads.htm
    http://www.lmdinnovative.com/products/lmdelpack/

another Unicode Library

    http://www.lischke-online.de/UnicodeLibrary.php

BOM option for Clean

A good encoding converter will also offer options for adding or removing the BOM:

  • Unconditionally prefix the output text with U+FEFF.
  • Prefix the output text with U+FEFF unless it is already there.
  • Remove the first character if it is U+FEFF.
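
The three BOM options can be sketched as operations on already-decoded text. This is a minimal Python sketch for illustration only, not code from any of the libraries linked on this page; the function names are hypothetical.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark as a decoded character


def add_bom(text: str) -> str:
    """Unconditionally prefix the text with U+FEFF."""
    return BOM + text


def ensure_bom(text: str) -> str:
    """Prefix the text with U+FEFF unless it already starts with one."""
    return text if text.startswith(BOM) else BOM + text


def strip_bom(text: str) -> str:
    """Remove the first character if it is U+FEFF."""
    return text[1:] if text.startswith(BOM) else text
```

Note that the unconditional variant can produce a doubled BOM when the input already carries one, which is why the second option is usually the safer default.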

Encode

Nodes for converting various 7-bit and 8-bit encodings to Unicode and back. This allows for some historical perspective and solves most Mac/Unix/PC issues with text files.

Note that most of the code needed below is already implemented in Indy and the Open XML Utility Library.

Desirable Encodings

  • ASCII-1963 X3.4 (7bit), see http://www.wps.com/projects/code
  • ASCII-1967 (7bit, in UnicodeConv.pas)
  • ISO 646 national variants (7bit)
  • ISO 8859 1-15 (8bit, in UnicodeConv.pas)
  • MAC 10000-10081 (8bit, in UnicodeConv.pas)
  • KOI8_R (8bit, russian, in UnicodeConv.pas)
  • JIS_X0201
  • Various MSDOS Codepages (8bit, in UnicodeConv.pas) including IBMPC, EBCDIC, see http://www.sferyx.com/htmleditor/supportedencodings.htm
  • NextStep
  • PETSCII
  • UTF-8 (see also JCL)
  • UTF-8 / BOM
  • UTF-16 BE
  • UTF-16 BE / BOM
  • UTF-16 LE
  • UTF-16 LE / BOM
  • UTF-7
  • XML
  • TikiWiki (see also kalle patches)
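
To make the difference between the Unicode entries in this list concrete, here is a minimal Python sketch (Python is used only for illustration) that serializes one short string as UTF-8 and as UTF-16 in both byte orders, each with and without a BOM. The helper name `unicode_variants` is hypothetical.

```python
import codecs


def unicode_variants(text: str) -> dict:
    """Return the text serialized in each Unicode form, with and without BOM."""
    utf8 = text.encode("utf-8")
    utf16_be = text.encode("utf-16-be")  # big-endian; this codec adds no BOM
    utf16_le = text.encode("utf-16-le")  # little-endian; this codec adds no BOM
    return {
        "UTF-8": utf8,
        "UTF-8 / BOM": codecs.BOM_UTF8 + utf8,
        "UTF-16 BE": utf16_be,
        "UTF-16 BE / BOM": codecs.BOM_UTF16_BE + utf16_be,
        "UTF-16 LE": utf16_le,
        "UTF-16 LE / BOM": codecs.BOM_UTF16_LE + utf16_le,
    }


if __name__ == "__main__":
    # "A" followed by the euro sign (U+20AC)
    for name, data in unicode_variants("A\u20ac").items():
        print(f"{name:16} {data.hex()}")
```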

Pins

  • DEFAULT_CHAR for all nonrepresentable chars
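
What such a DEFAULT_CHAR pin might do can be sketched as follows: every character the target encoding cannot represent is replaced by a caller-supplied default. This is an illustrative Python sketch under that assumption; `encode_with_default` is a hypothetical name, and the per-character approach is only suitable for stateless 7-bit and 8-bit encodings.

```python
def encode_with_default(text: str, encoding: str, default_char: str = "?") -> bytes:
    """Encode text, substituting default_char for non-representable characters.

    Assumes default_char itself is representable in the target encoding.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            # The character has no code point in the target encoding.
            out += default_char.encode(encoding)
    return bytes(out)
```

For example, encoding "Grüße €" to ASCII with the default "?" replaces ü, ß and € and leaves the rest untouched.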

Linefeed converter node (independent of UTF-8 and Unicode)

  • CP/M, Microsoft DOS and Windows use the sequence 0D 0A (CR LF), familiar from the days of teleprinters.
  • Apple/Mac use 0D (CR).
  • UNIX and Linux use the standard line break 0A (LF).
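
A linefeed converter along these lines can be sketched in a few lines of Python. This is only an illustration, not the node's implementation; the names `LINE_ENDINGS` and `convert_line_endings` are hypothetical, and the sketch assumes it operates on already-decoded text.

```python
import re

# Target conventions from the list above.
LINE_ENDINGS = {
    "dos": "\r\n",   # 0D 0A (CR LF)
    "mac": "\r",     # 0D (CR)
    "unix": "\n",    # 0A (LF)
}

# CR LF must be tried first so it is treated as one line break, not two.
_ANY_EOL = re.compile(r"\r\n|\r|\n")


def convert_line_endings(text: str, target: str) -> str:
    """Rewrite every line break in text to the target convention."""
    eol = LINE_ENDINGS[target]
    return _ANY_EOL.sub(lambda _match: eol, text)
```

Because every known line break is matched in a single pass, the function also copes with input that mixes conventions.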

Encoding links

Software Defect Patterns which Break Text Integrity

In the world of Internationalization software engineering, one of the most common defect behavior patterns is Garbage Text or Garbled Text. Sometimes, people also refer to it as Mojibake, which is a transliteration of a Japanese term that means garbage. After we take a closer look at different Garbage Text, we observe different kinds of defect behavior sub-patterns:

http://people.netscape.com/ftang/paper/unicode25/a302_v1.htm

UTF-8 --> RFC 3629

http://www.cl.cam.ac.uk/~mgk25/unicode.html
http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
http://acspro.atari.org/KeyTab/Normal/006024.html
async pro -> Adxbase.pas

EBCDIC

http://www.legacyj.com/cobol/ebcdic.html

PETSCII

http://www.df.lth.se/~triad/krad/recode/petscii.html

UTF-7

http://www.zeitungsjunge.de/delphi/unicode/ (can also handle UTF-7)
http://www.faqs.org/rfcs/rfc2152.html
http://acspro.atari.org/KeyTab/Normal/006027.html

ISO 646-DE etc.

http://www.soziologie.uni-halle.de/unger/scripts/workshop_internet/ref_char_646.html
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-006.pdf

Iso8859

  • Iso8859-1 Latin1 (West European)
  • Iso8859-2 Latin2 (East European)
  • Iso8859-3 Latin3 (South European)
  • Iso8859-4 Latin4 (North European)
  • Iso8859-5 Cyrillic
  • Iso8859-6 Arabic
  • Iso8859-7 Greek
  • Iso8859-8 Hebrew
  • Iso8859-9 Latin5 (Turkish)
  • Iso8859-10 Latin6 (Nordic)
  • Iso8859-13 Latin7 Baltic Rim
  • Iso8859-14 Latin8 Gaelic and Welsh
  • Iso8859-15 Latin9 (replaces the less needed Latin1 symbols ¤¦¨´¸¼½¾ with the euro sign and previously missing French and Finnish letters)
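
The difference between Latin1 and Latin9 can be checked mechanically by decoding every byte value with both code pages and listing the positions that disagree. The Python sketch below does that using Python's built-in codecs; `differing_bytes` is a hypothetical helper name.

```python
def differing_bytes(enc_a: str, enc_b: str) -> list:
    """Return (byte value, char in enc_a, char in enc_b) for every mismatch.

    Assumes both encodings define all 256 byte values, which holds for
    the two code pages compared here.
    """
    diffs = []
    for value in range(256):
        raw = bytes([value])
        a = raw.decode(enc_a)
        b = raw.decode(enc_b)
        if a != b:
            diffs.append((value, a, b))
    return diffs


if __name__ == "__main__":
    for value, latin1, latin9 in differing_bytes("latin-1", "iso8859-15"):
        print(f"0x{value:02X}: {latin1} -> {latin9}")
```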

function US_ASCIIToUTF16Str(const S: string): WideString;
function Iso8859_1ToUTF16Str(const S: string): WideString;
function Iso8859_2ToUTF16Str(const S: string): WideString;
function Iso8859_3ToUTF16Str(const S: string): WideString;
function Iso8859_4ToUTF16Str(const S: string): WideString;
function Iso8859_5ToUTF16Str(const S: string): WideString;
function Iso8859_6ToUTF16Str(const S: string): WideString;
function Iso8859_7ToUTF16Str(const S: string): WideString;
function Iso8859_8ToUTF16Str(const S: string): WideString;
function Iso8859_9ToUTF16Str(const S: string): WideString;
function Iso8859_10ToUTF16Str(const S: string): WideString;
function Iso8859_13ToUTF16Str(const S: string): WideString;
function Iso8859_14ToUTF16Str(const S: string): WideString;
function Iso8859_15ToUTF16Str(const S: string): WideString;

function KOI8_RToUTF16Str(const S: string): WideString; // Russian
function JIS_X0201ToUTF16Str(const S: string): WideString;
function nextStepToUTF16Str(const S: string): WideString;

function cp10000_MacRomanToUTF16Str(const S: string): WideString;
function cp10006_MacGreekToUTF16Str(const S: string): WideString;
function cp10007_MacCyrillicToUTF16Str(const S: string): WideString;
function cp10029_MacLatin2ToUTF16Str(const S: string): WideString;
function cp10079_MacIcelandicToUTF16Str(const S: string): WideString;
function cp10081_MacTurkishToUTF16Str(const S: string): WideString;

function cp037ToUTF16Str(const S: string): WideString; // ebcdic-cp-us
function cp424ToUTF16Str(const S: string): WideString; // x-EBCDIC-Hebrew
function cp437ToUTF16Str(const S: string): WideString; // original IBMPC with box chars
function cp437_DOSLatinUSToUTF16Str(const S: string): WideString;
function cp500ToUTF16Str(const S: string): WideString; // EBCDIC 500V1
function cp737_DOSGreekToUTF16Str(const S: string): WideString; // PC Greek
function cp775_DOSBaltRimToUTF16Str(const S: string): WideString; // PC Baltic
function cp850ToUTF16Str(const S: string): WideString; // MS-DOS Latin-1
function cp850_DOSLatin1ToUTF16Str(const S: string): WideString;
function cp852ToUTF16Str(const S: string): WideString; // MS-DOS Latin-2
function cp852_DOSLatin2ToUTF16Str(const S: string): WideString;
function cp855ToUTF16Str(const S: string): WideString; // MS-DOS Cyrillic
function cp855_DOSCyrillicToUTF16Str(const S: string): WideString;
function cp856_Hebrew_PCToUTF16Str(const S: string): WideString;
function cp857ToUTF16Str(const S: string): WideString; // IBM Turkish
function cp857_DOSTurkishToUTF16Str(const S: string): WideString;
function cp860ToUTF16Str(const S: string): WideString; // MS-DOS Portuguese
function cp860_DOSPortugueseToUTF16Str(const S: string): WideString;
function cp861ToUTF16Str(const S: string): WideString;
function cp861_DOSIcelandicToUTF16Str(const S: string): WideString;
function cp862ToUTF16Str(const S: string): WideString;
function cp862_DOSHebrewToUTF16Str(const S: string): WideString;
function cp863ToUTF16Str(const S: string): WideString;
function cp863_DOSCanadaFToUTF16Str(const S: string): WideString;
function cp864ToUTF16Str(const S: string): WideString;
function cp864_DOSArabicToUTF16Str(const S: string): WideString;
function cp865ToUTF16Str(const S: string): WideString;
function cp865_DOSNordicToUTF16Str(const S: string): WideString;
function cp866ToUTF16Str(const S: string): WideString;
function cp866_DOSCyrillicRussianToUTF16Str(const S: string): WideString;
function cp869ToUTF16Str(const S: string): WideString;
function cp869_DOSGreek2ToUTF16Str(const S: string): WideString;

function cp874ToUTF16Str(const S: string): WideString; // Thai
function cp875ToUTF16Str(const S: string): WideString;
function cp932ToUTF16Str(const S: string): WideString;
function cp936ToUTF16Str(const S: string): WideString;
function cp949ToUTF16Str(const S: string): WideString;
function cp950ToUTF16Str(const S: string): WideString;
function cp1006ToUTF16Str(const S: string): WideString;
function cp1026ToUTF16Str(const S: string): WideString;
function cp1250ToUTF16Str(const S: string): WideString;
function cp1251ToUTF16Str(const S: string): WideString;
function cp1252ToUTF16Str(const S: string): WideString;
function cp1253ToUTF16Str(const S: string): WideString;
function cp1254ToUTF16Str(const S: string): WideString;
function cp1255ToUTF16Str(const S: string): WideString;
function cp1256ToUTF16Str(const S: string): WideString;
function cp1257ToUTF16Str(const S: string): WideString;
function cp1258ToUTF16Str(const S: string): WideString;

function UTF8ToUTF16BEStr(const S: string): WideString;
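
How the functions above are implemented is not shown on this page. A plausible approach for the single-byte code pages is a 256-entry lookup table from byte value to Unicode character. The Python sketch below illustrates that idea under this assumption, borrowing Python's built-in codecs to fill the table; `build_table` and `decode_single_byte` are hypothetical names.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, used for byte values the code page leaves undefined


def build_table(encoding: str) -> list:
    """Build a 256-entry byte-to-character table for a single-byte encoding."""
    table = []
    for value in range(256):
        try:
            table.append(bytes([value]).decode(encoding))
        except UnicodeDecodeError:
            table.append(REPLACEMENT)
    return table


def decode_single_byte(data: bytes, table: list) -> str:
    """Convert legacy 8-bit data to Unicode text with one lookup per byte."""
    return "".join(table[byte] for byte in data)


if __name__ == "__main__":
    koi8_r = build_table("koi8-r")
    text = decode_single_byte(b"\xf0\xd2\xc9\xd7\xc5\xd4", koi8_r)
    # Serialize as UTF-16 BE, the form the WideString results correspond to
    # only loosely; the byte order here is an illustrative choice.
    print(text.encode("utf-16-be").hex())
```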
