December 25, 2010

Meaningful vs. Meaningless

2000/1/23, posted by Rene Zandbergen

In order to have real, strong evidence that the VMs contains meaningful text, we need to know how one can create a 'meaningless' text that still exhibits the same properties as meaningful text. More to the point: we need to find a mechanism that could have been applied 400-500 years ago.

Jacques already pointed out that we don't actually know how to define meaningful and meaningless. This may well prove to be a serious problem. When trying to generate meaningless texts which the LSC would classify as meaningful, or vice versa, we're likely to end up in the no-man's land between the two. Take a meaningful text and start removing words (every 10th, every 2nd, at random...). When does the text stop being meaningful? How does the LSC curve behave?
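
The word-removal experiment can be set up in a few lines. The following is only a sketch: the function names `drop_every_nth` and `drop_at_random` are made up for illustration, and feeding the degraded text to the LSC (or any other test) is left out.

```python
import random

def drop_every_nth(words, n):
    """Remove the n-th, 2n-th, 3n-th... word, keeping the rest in order."""
    return [w for i, w in enumerate(words, start=1) if i % n != 0]

def drop_at_random(words, fraction, seed=0):
    """Remove each word independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    return [w for w in words if rng.random() >= fraction]

words = "one two three four five six seven eight nine ten".split()
print(" ".join(drop_every_nth(words, 10)))  # mild damage
print(" ".join(drop_every_nth(words, 2)))   # heavy damage
print(" ".join(drop_at_random(words, 0.3)))
```

Running a test statistic on the output for a range of `n` or `fraction` values would show how quickly (or slowly) the curve degrades as words disappear.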

2000/1/24, posted by Jorge Stolfi

Consider that an ideal text compression algorithm should take "typical" texts and turn them into random-looking strings of bits. Of course this transformation preserves meaning (as long as one has the decompression algorithm!); but, for maximum compression, the program should equalize the bit probabilities and remove any correlations. Modern compressors like PKZIP go a long way in that direction. The compressed text, being shorter than the original, will actually have more meaning per unit length; but it will look like perfect gibberish to LSC-like tests.
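
A rough way to see the compression effect is to compare the byte-level entropy of a text before and after compression. The sketch below uses Python's `zlib` (DEFLATE, the method commonly used in ZIP archives). The input is a hypothetical stand-in, random words from a small vocabulary, because no corpus is bundled here; `byte_entropy` is an illustrative helper and not the LSC.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (maximum 8)."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

# Hypothetical stand-in for a "typical" text: 5000 words from a small
# vocabulary. A real corpus would be the proper input.
VOCAB = ("the of and to in is that it was for on are as "
         "with his they at be this from").split()
rng = random.Random(0)
plain = " ".join(rng.choice(VOCAB) for _ in range(5000)).encode("ascii")

packed = zlib.compress(plain, 9)    # DEFLATE at the highest compression level
restored = zlib.decompress(packed)  # the content is fully recoverable

print("length :", len(plain), "->", len(packed))
print("entropy:", round(byte_entropy(plain), 2), "->",
      round(byte_entropy(packed), 2), "bits per byte")
```

The compressed bytes are much closer to uniform than the original letters, yet `restored` is identical to `plain`, so nothing has been lost.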

Or, consider a meaningful plaintext XORed with the binary expansion of pi. The result will have uniform bit probabilities, and no visible correlations; but it will still carry the original meaning, which can be easily recovered. It would take a very sophisticated algorithm (one that knows that pi is a "special" number) to notice that the text is not an entirely random string of bits.
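
The pi example can be tried directly. This is a sketch under stated assumptions: the key is taken from the fractional bits of pi only (dropping the leading 3), the bits are computed with Machin's formula in integer arithmetic, and `pi_fraction_bytes` and `xor_with_pi` are names invented here.

```python
def arctan_inv(x: int, one: int) -> int:
    """Approximate arctan(1/x) * one using the alternating Taylor series."""
    power = one // x           # one / x**k, starting at k = 1
    total = power
    x2 = x * x
    k = 1
    while power:
        power //= x2
        k += 2
        term = power // k
        total += -term if k % 4 == 3 else term  # signs: +, -, +, - ...
    return total

def pi_fraction_bytes(n_bytes: int) -> bytes:
    """First 8 * n_bytes bits of the fractional part of pi."""
    guard = 32                 # extra bits to absorb truncation error
    bits = 8 * n_bytes + guard
    one = 1 << bits
    # Machin: pi = 16 * arctan(1/5) - 4 * arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5, one) - arctan_inv(239, one))
    frac = pi_scaled - 3 * one # drop the integer part
    return (frac >> guard).to_bytes(n_bytes, "big")

def xor_with_pi(data: bytes) -> bytes:
    """XOR data with the pi bit stream; applying it twice restores the input."""
    key = pi_fraction_bytes(len(data))
    return bytes(a ^ b for a, b in zip(data, key))

message = b"This sentence still carries its meaning."
scrambled = xor_with_pi(message)
print(scrambled.hex())
print(xor_with_pi(scrambled).decode("ascii"))
```

Since pi is 3.243F6A88... in hexadecimal, the key stream begins with the bytes 24 3F 6A 88. Anyone who knows the key is pi recovers the message with a second XOR; a statistical test that sees only `scrambled` has nothing obvious to hold on to.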

So the LSC and possible variants are not tests of `meaning' but rather of `naturalness.' They work because natural language uses its medium rather inefficiently, but in a rather peculiar way: it uses symbols with unequal frequencies (a feature that mechanical monkeys can imitate), but changes those frequencies over long distances (something which simple monkeys won't do).
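
The point about frequencies changing over long distances can be illustrated with a very simple statistic: cut the text into chunks, count each symbol per chunk, and sum the variances of those counts. This is a simplified stand-in chosen for illustration, not the LSC itself, and the toy text and the name `chunk_dispersion` are assumptions. Shuffling the letters keeps the overall frequencies but destroys any long-range change.

```python
import random
from collections import Counter
from statistics import pvariance

def chunk_dispersion(text: str, chunk: int = 100) -> float:
    """Sum over symbols of the variance of their per-chunk counts."""
    chunks = [Counter(text[i:i + chunk])
              for i in range(0, len(text) - chunk + 1, chunk)]
    return sum(pvariance([c[s] for c in chunks]) for s in set(text))

def shuffled(text: str, seed: int = 0) -> str:
    """Same letters, same overall frequencies, random order."""
    letters = list(text)
    random.Random(seed).shuffle(letters)
    return "".join(letters)

# Toy text with two "topics", so letter frequencies differ between halves.
toy = ("the cat sat on the mat " * 50
       + "quick zebras jump over xylophones " * 50)

print("original:", round(chunk_dispersion(toy), 1))
print("shuffled:", round(chunk_dispersion(shuffled(toy)), 1))
```

The original scores far higher than its shuffle, because a letter such as "t" is common in one half and absent in the other. A source with fixed frequencies would behave like the shuffled version.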

However, with slightly smarter monkeys one *can* generate meaningless texts that fool the LSC; and the same applies for any "meaning detector" that looks only at the message. Conversely, one can always encode a meaningful text so as to make it look "random" to the LSC. In short, a naturally produced (and natural-looking) text can be quite meaningless, while a meaningful text may be (and look) quite unnatural.
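
One possible "slightly smarter monkey" is sketched below, purely as an assumption about what such a generator might look like: it types at random like a simple monkey, but re-draws its letter frequencies for every section. The output is meaningless by construction, yet its per-chunk letter counts vary far more than those of a fixed-frequency monkey. The alphabet, the weights, and the crude variance measure are all illustrative choices, and the measure is not the LSC.

```python
import random
from collections import Counter
from statistics import pvariance

ALPHABET = "abcdefghij"

def simple_monkey(rng, length, weights):
    """Letters drawn independently with fixed, unequal frequencies."""
    return "".join(rng.choices(ALPHABET, weights=weights, k=length))

def smart_monkey(rng, sections, section_len):
    """Meaningless text whose letter frequencies are re-drawn per section."""
    parts = []
    for _ in range(sections):
        weights = [rng.random() ** 3 for _ in ALPHABET]  # fresh skewed weights
        parts.append(simple_monkey(rng, section_len, weights))
    return "".join(parts)

def chunk_dispersion(text, chunk=200):
    """Sum over letters of the variance of their per-chunk counts."""
    chunks = [Counter(text[i:i + chunk])
              for i in range(0, len(text) - chunk + 1, chunk)]
    return sum(pvariance([c[ch] for c in chunks]) for ch in ALPHABET)

rng = random.Random(1)
simple = simple_monkey(rng, 8000, [10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
smart = smart_monkey(rng, 8, 1000)
print("simple monkey:", round(chunk_dispersion(simple), 1))
print("smart monkey :", round(chunk_dispersion(smart), 1))
```

Any test that looks only for long-range frequency changes would rate the second text as the more "natural" of the two, although neither says anything.
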
posted by ぶらたん at 10:54 | Other