December 16, 2010

frequency

1998/3/1, posted by Karl Kluge

Here is the full frequency data for the transformed text using the mapping from Tiltman structures to individual characters given above:

Vowels identified by Sukhotin's algorithm: I J K E A L B G 4
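
For reference, Sukhotin's algorithm separates vowels from consonants using nothing but counts of which letters occur next to each other. The following is a minimal Python sketch of the general technique, not the code that produced the list above; the function name `sukhotin_vowels` and the toy input are hypothetical, and ties are broken alphabetically as an arbitrary choice.

```python
from collections import defaultdict

def sukhotin_vowels(words):
    """Return letters classified as vowels, in the order they are picked."""
    # Symmetric adjacency counts; pairs of identical letters are ignored,
    # which keeps the diagonal of the matrix at zero.
    adj = defaultdict(lambda: defaultdict(int))
    letters = set()
    for w in words:
        letters.update(w)
        for a, b in zip(w, w[1:]):
            if a != b:
                adj[a][b] += 1
                adj[b][a] += 1

    # Every letter starts as a consonant with its row sum as score.
    sums = {c: sum(adj[c].values()) for c in letters}
    consonants = set(letters)
    vowels = []
    while consonants:
        # Pick the highest-scoring consonant (alphabetical order on ties).
        best = max(sorted(consonants), key=lambda c: sums[c])
        if sums[best] <= 0:
            break
        vowels.append(best)
        consonants.remove(best)
        # Adjacency to a known vowel now counts against being a vowel.
        for c in consonants:
            sums[c] -= 2 * adj[c][best]
    return vowels

print(sukhotin_vowels(["banana"]))  # ['a']
```

On very small samples the method is unreliable; it is meant for texts long enough that the adjacency counts are stable.
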
Letter  Global  Line-initial  Line-final  Word-initial  Word-final
0 5.84927 0.52506 0.89153 1.54647 21.70076
1 7.56907 0.90692 2.00594 2.49814 15.81762
2 1.87020 0.23866 0.14859 1.14498 7.03303
3 2.68105 0.71599 0.37147 1.44238 7.83527
4 0.06866 0.04773 0.07429 0.05948 0.25404
5 0.26484 0.33413 0.00000 0.17844 0.45461
6 0.00000 0.00000 0.00000 0.00000 0.00000
7 0.00000 0.00000 0.00000 0.00000 0.00000
8 0.00000 0.00000 0.00000 0.00000 0.00000
9 0.00000 0.00000 0.00000 0.00000 0.00000
A 3.48210 5.25060 1.26300 4.90706 1.60449
B 2.89031 4.72554 1.04012 4.32714 1.55101
C 0.39235 0.14320 0.00000 0.75836 0.13371
D 0.12424 0.04773 0.07429 0.14870 0.05348
E 4.78339 10.54893 0.89153 14.98885 1.36382
F 0.05231 0.00000 0.00000 0.16357 0.01337
G 1.77538 5.34606 0.14859 4.72862 0.82899
H 0.15694 0.14320 0.00000 0.46097 0.04011
I 16.53752 16.84964 3.26895 20.66914 11.49886
J 7.42194 10.73986 0.52006 8.28253 3.65022
K 11.84241 19.18854 13.15007 14.92937 15.48335
L 3.28592 11.31265 6.53789 3.58364 1.29696
M 4.51855 3.15036 7.20654 2.24535 1.93876
N 9.12539 6.49165 11.73848 7.83643 2.68753
O 0.24849 0.00000 2.52600 0.04461 0.05348
P 2.39333 0.28640 6.61218 0.65428 0.45461
Q 5.25421 1.24105 20.20802 1.56134 1.39056
R 0.14059 0.00000 0.74294 0.08922 0.02674
S 3.48210 1.24105 9.21248 1.47212 1.23011
T 0.40543 0.19093 1.26300 0.16357 0.20056
U 0.12424 0.04773 0.14859 0.07435 0.09360
V 0.00000 0.00000 0.00000 0.00000 0.00000
W 3.21726 0.28640 9.88113 1.01115 1.31034
X 0.02616 0.00000 0.00000 0.01487 0.00000
Y 0.01635 0.00000 0.07429 0.01487 0.00000
Z 0.00000 0.00000 0.00000 0.00000 0.00000

Entropy 4.00770 3.47030 3.59539 3.65581 3.46832
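
The Entropy row appears to be the first-order Shannon entropy, in bits, of each column's distribution. As a check, here is a minimal Python sketch that recomputes it from the Global column; the base-2 logarithm and the name `shannon_entropy` are assumptions, not something stated in the post. The result should come out close to the 4.00770 listed.

```python
import math

# Global column of the letter table, in percent (all-zero rows omitted).
GLOBAL_PCT = {
    "0": 5.84927, "1": 7.56907, "2": 1.87020, "3": 2.68105,
    "4": 0.06866, "5": 0.26484, "A": 3.48210, "B": 2.89031,
    "C": 0.39235, "D": 0.12424, "E": 4.78339, "F": 0.05231,
    "G": 1.77538, "H": 0.15694, "I": 16.53752, "J": 7.42194,
    "K": 11.84241, "L": 3.28592, "M": 4.51855, "N": 9.12539,
    "O": 0.24849, "P": 2.39333, "Q": 5.25421, "R": 0.14059,
    "S": 3.48210, "T": 0.40543, "U": 0.12424, "W": 3.21726,
    "X": 0.02616, "Y": 0.01635,
}

def shannon_entropy(weights):
    """Shannon entropy in bits of a list of non-negative weights."""
    weights = list(weights)
    total = sum(weights)
    if total == 0:
        return 0.0
    # Zero-weight symbols contribute nothing and are skipped.
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)

h_global = shannon_entropy(GLOBAL_PCT.values())
print(f"Global entropy: {h_global:.5f} bits")
```

The same function can be applied to any of the other four columns by substituting their values.
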
--------------------------------------------------------
Digraphs whose maximum frequency in any column (global, line-initial, etc.) exceeds 2.5%:
Digraph  Global  Line-initial  Line-final  Word-initial  Word-final  wf/wi
0E 1.4085 0.0000 0.0000 0.0000 0.0000 6.2592
0I 0.7781 0.0513 0.0000 0.0000 0.0321 3.3754
0N 0.6196 0.0000 0.0899 0.0000 0.0160 2.6217
1E 0.6880 0.0000 0.0000 0.0000 0.0000 2.9985
1I 0.9726 0.1540 0.0000 0.2967 0.5451 2.7691
1K 1.1131 0.1027 0.1799 0.3894 1.6194 2.5397
E0 0.7745 1.0267 0.1799 3.1337 3.2227 0.0164
E2 0.6268 1.3347 0.0000 2.5032 2.6615 0.0000
I0 2.4063 2.4641 0.4496 3.8569 9.6841 0.0164
I1 3.8940 3.0287 0.7194 5.8780 9.2512 0.2622
II 1.0410 0.8727 0.1799 0.3894 0.4489 2.9330
IK 2.3558 2.1561 1.3489 2.9112 5.0024 2.4742
J0 1.6282 1.1807 0.0899 2.1509 6.5416 0.0328
J1 2.1181 2.1047 0.5396 2.6516 5.0505 0.0983
KI 1.4805 5.1848 0.2698 0.9271 0.9139 2.8347
KJ 0.6952 2.8747 0.0000 0.4636 0.2565 1.2125
KP 0.6664 2.0534 2.7878 0.8159 0.2245 0.0655
KQ 2.4675 4.4661 12.8597 4.3019 0.8498 0.1147
KS 1.0807 1.3860 4.2266 2.0026 0.4008 0.2458
KW 0.9474 0.5133 3.9568 2.0026 0.5451 0.0655
LN 0.4431 2.8747 0.3597 0.3894 0.0802 0.0164
NK 1.7543 0.6160 2.9676 1.0013 2.2607 0.4752

h2 3.57014 3.23198 2.96356 3.27572 2.79567 3.47274

Any suggestions on how to proceed with testing this hypothesis regarding the nature of the encoding (and, more to the point, finding the correct mappings of Voynich character combinations to plaintext characters, if this is the type of cipher we're dealing with)?
posted by ぶらたん at 23:09 | Category: Properties of the text

C89 ratio

1998/2/9, posted by Denis Mardle

my remark that implied Herbal B1 and B2 were the same language/hand, despite the pretty fit to the quires from Karl's work, which could still be valid since I was only looking at the Currier O89 to C89 ratio. This ratio is very good at sorting out sets. For instance, Herbal B1 has 23.7% O89 (41 to 132) and Herbal B2 has 20.0% (67 to 269). These figures fit into the range for the Stars B sets f104r, f105r, f106r, f107r (see my later "quires ..."), which are in the 16-26% range. The f103r and f108r sets have only 4.7%, closer to the Bio-B figure of only 0.5%, but, I suspect, significantly different. Herbal A is very different again, with the ratio at 98.5% (270 to 4). My conclusion is that neither Herbal B1 nor B2 can go with Bio-B, and that the O89 to C89 ratio test does not split B1 from B2. I will accept another statistic that shows a significant difference between B1 and B2, but I need to see the figures. The O89 to C89 ratio (at 98.5%) will not split the Herbal A sets.
posted by ぶらたん at 22:10 | Category: Properties of the text
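
The quoted percentages are consistent with reading the "ratio" as O89 occurrences divided by O89 plus C89 occurrences. That reading is inferred from the figures rather than stated in the post. A minimal Python sketch under that assumption (the name `o89_share` is hypothetical):

```python
def o89_share(o89_count, c89_count):
    """Return O89 as a percentage of all O89 + C89 occurrences."""
    total = o89_count + c89_count
    if total == 0:
        raise ValueError("no O89 or C89 occurrences")
    return 100.0 * o89_count / total

# (O89 count, C89 count) pairs quoted in the post.
counts = {
    "Herbal B1": (41, 132),
    "Herbal B2": (67, 269),
    "Herbal A": (270, 4),
}

for name, (o89, c89) in counts.items():
    print(f"{name}: {o89_share(o89, c89):.1f}% O89")
```

This reproduces 23.7% for Herbal B1 and 98.5% for Herbal A. Herbal B2 comes out at about 19.9%, marginally below the 20.0% quoted.
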