
Script Challenge: can you figure this out?

I’ve decided that each weekend I’ll dig out an object or two from my more distant past and write about it. To kick things off, here’s a challenge which was originally created by the same chap who coined my name.

The text you can see in the image below (at least if you happen to be sighted) is in an unknown script. Your task is obvious, I think. The only clues you have are that it’s a quote from a book by Ursula Le Guin and it’s nothing whatsoever to do with Tolkien.

Image of text in an unknown alphabet

Now originally I solved this in under 2 days, without the aid of computers or amphetamines. I reckon that in The Age of the Internet you can do better. I’ll negotiate a suitable prize for the first person who posts the solution.


  1. Stephen Stockwell’s avatar

    It’s been 14 months, Stil. No takers.

    That tells me that the other 99.96-ish per cent of the population (like me) who aren’t cryptography geeks, will need the help of our protocol droids if we’re to solve this without shelving all our commitments for a month or whatever.

    Reply

  2. Stilgherrian’s avatar

    Stephen Stockwell: I find looking at the preview of your post and correcting problems before posting to be invaluable. :) OTOH, I could register you so you can edit your comments? Editing the past is quite popular now that everything is digital.

    As far as cryptanalysis goes, this puzzle is real beginner-grade material. I’m quite surprised no-one’s even had a go.

    Reply

  3. Stephen Stockwell’s avatar

    Cryptanalysis personally = Could. Not. Be. Arsed. Ball-tearer of an interesting subject, though.

    Reply

  4. Bob Bain’s avatar

    I’ve had a brief look at this following your tweet about nobody solving it.

    Observation 1. It could be a simple character substitution code given that at least two “scribbles” are repeated throughout the script – the first “scribble” and last “scribble” on the first line for instance. If this is the case the commonly used English character frequency of letters table starting ETANOISH… could be of use.

    Observation 2. Although you state it has absolutely nothing to do with “Tolkien”, clearly this is a clue in itself and research into Tolkien is clearly in order. I note that the “Tolkien logo” bears a resemblance to the strokes (scribbles) in the script.

    Observation 3. If this is a character substitution code and if each line is a word then the second line could only be the “word” a or i.

    Observation 4. If the second character is an a or an i and if each line is a word then the penultimate line would be a two character word that could only start with an a e i o or u. If the second letter is an a or an i this reduces the number of permutations to words such as it or of etc.

    Observation 5. I had never heard of Ursula LeGuin but Google research into “quotes Ursula LeGuin” results in quite a few interesting observations about life the universe and everything but none quite match my deliberations although “if you light a candle you also cast a shadow” has an appropriate place given that it has an “a” where I might expect to find an “a”.
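    The frequency idea in Observation 1 can be sketched roughly as follows. This is only an illustration: the symbol labels are invented (the real text is an image and has not been transcribed here), `frequency_guess` is a hypothetical helper, and ETAOINSHRDLU is one commonly quoted ordering of English letters rather than the table mentioned above.

```python
from collections import Counter

# Hypothetical transcription: every distinct "scribble" gets an arbitrary label.
# These labels are made up for illustration; they are NOT read from the image.
ciphertext = ["s1", "s2", "s3", "s1", "s4", "s1", "s2", "s5"]

# Assumption: one commonly quoted frequency ordering of English letters.
ENGLISH_ORDER = "ETAOINSHRDLU"

def frequency_guess(symbols):
    """Pair each symbol, most frequent first, with the next letter in ENGLISH_ORDER."""
    ranked = [sym for sym, _count in Counter(symbols).most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))

guess = frequency_guess(ciphertext)
for symbol, letter in guess.items():
    print(symbol, "->", letter)
```

    With only a few dozen symbols the counts are too noisy to trust, so a first guess like this is a starting point for trial and error, not an answer.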

    Reply

  5. Stilgherrian’s avatar

    @Bob Bain: Your approach to cryptanalysis is taking the right path, mostly. My Tolkien comment was originally because his Tengwar script, also known as Feanorian letters, is the one which appears most frequently in his work — and the first wrong path which some people have taken when trying to solve the puzzle.

    Sample of Tolkien's Tengwar script: Ennyn Durin Aran Moria: pedo mellon a minno. (The Doors of Durin, Lord of Moria. Speak, friend, and enter.)

    This sample reads “Ennyn Durin Aran Moria: pedo mellon a minno”, which is Sindarin, an elvish language, for “The Doors of Durin, Lord of Moria. Speak, friend, and enter.” I could once read and write those letters, some decades ago, but I never went as far as to learn the languages themselves. I could pronounce the words, but not understand them.

    That Le Guin quote you found is indeed beautiful, but it is not the one. The full context, from the first of her Earthsea trilogy, A Wizard of Earthsea, is some advice to the young wizard Sparrowhawk:

    You must not change one thing, one pebble, one grain of sand, until you know what good and evil will follow on that act. The world is in balance, in Equilibrium. A wizard’s power of Changing and Summoning can shake the balance of the world. It is dangerous, that power. It is most perilous. It must follow knowledge, and serve need. To light a candle is to cast a shadow.

    Now as I say, Bob, your approach is mostly the right path. The language is English, just written in a different alphabet. A letter-frequency attack is the right first step for a letter-substitution cypher. But this is not strictly a letter-substitution cypher.

    One more clue: Not all alphabets have the same number of letters. Why is that?

    Reply

  6. Bob Bain’s avatar

    Right, the Greek alphabet has 24 letters: alpha, beta, through to omega.

    http://en.wikipedia.org/wiki/Greek_alphabet

    The Phoenician alphabet (linked to above) offers some promise but there is no clear relation between the symbols in either the Greek or Phoenician alphabets to the script above.

    Wikipedia informs me in their article on Egyptian hieroglyphs that there are logographic and alphabetic representations. The Japanese I seem to recall tend to mix their logographic symbols with English and/or European characters which makes travelling in Tokyo quite interesting as the English bits give an idea of what’s going on.

    (working on it.. working on it… It’ll take longer than two days !)

    Reply

  7. Jason Langenauer’s avatar

    It seems to me that some of the letters have an inordinate number of strokes. So, I’ll offer the hypothesis that the number of strokes and dots in each letter has some relationship to the plain-text. With the number of dots in parentheses, my count of the strokes is

    4(2) 5(2) 13(1) 4(0) 4(2) 6(2) 12(2) 4(0)
    14(3)
    9(4) 12(0) 4(2) 9(1) 14(0)
    6(0) 2(1) 17(2) 4(0) 4(2) 19(1) 4(0)
    12(3)
    3(3) 4(2) 12(0) 4(2) 5(1) 5(1) 10(2)

    Now, this all looks good up until the last line, where two sets of different letters have the same number of strokes and dots – which may be intentional, as a coding device to represent an English (or more correctly, Latin) letter with two or more coded letters, or it may not be, in which case the substitution breaks.

    Perhaps there is meaning as to whether the dot is above or below the central horizontal axis of the script, which would allow the two 5(1)’s to be different letters, but the similarity of the two 4(2)’s on the final line would tend to disprove the hypothesis.
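    The counts above can be tallied mechanically. The sketch below takes the numbers exactly as listed and, purely as a working assumption, treats two symbols with the same (strokes, dots) pair as the same symbol, which is the hypothesis under test and may well be wrong.

```python
from collections import Counter

# (strokes, dots) pairs copied from the counts listed above, one list per line.
lines = [
    [(4, 2), (5, 2), (13, 1), (4, 0), (4, 2), (6, 2), (12, 2), (4, 0)],
    [(14, 3)],
    [(9, 4), (12, 0), (4, 2), (9, 1), (14, 0)],
    [(6, 0), (2, 1), (17, 2), (4, 0), (4, 2), (19, 1), (4, 0)],
    [(12, 3)],
    [(3, 3), (4, 2), (12, 0), (4, 2), (5, 1), (5, 1), (10, 2)],
]

# Symbols per line, i.e. the word lengths if each line is one word.
word_lengths = [len(line) for line in lines]

# How often each (strokes, dots) pair occurs across the whole text.
tally = Counter(pair for line in lines for pair in line)

print("symbols per line:", word_lengths)
for pair, count in tally.most_common():
    if count > 1:
        print(pair, "occurs", count, "times")
```

    Only four pairs repeat, with 4(2) and 4(0) far ahead of the rest, so these would be the natural candidates to try against the most common letters or sounds.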

    That’s about as far as my reasoning takes me, pre-coffee on a Sunday morning.

    Reply

  8. Stilgherrian’s avatar

    @Jason Langenauer: Welcome to the Game, Sir! Now, what is a letter and what is a word? Inordinate numbers indeed — or perhaps not.

    Reply

  9. Jason Langenauer’s avatar

    I have often found it useful in many problem domains to list out the assumptions one is making, so that they may be validated. So, what are the assumptions in play here? And how can we validate them?

    1. That each grapheme in the script is separated from other graphemes by a continuous section of space. This means that the fifth line is one character, not two as @bob_bain has suggested. Can we prove or disprove this? Not really, so let’s keep it as a working assumption.

    2. That there is a one-to-one correspondence between a grapheme in the script, and a letter in the Latin alphabet – i.e. each grapheme maps to exactly one Latin letter, and each Latin letter maps to exactly one grapheme. Can we prove or disprove this? Yes, because Stilgherrian confirmed it when he said “The language is English, just written in a different alphabet”.

    3. That the different lines represent new words – i.e, based on the first assumption, the English quote is of the form “xxxxxxxx x xxxxx xxxxxxx x xxxxxxx”. Hmm, that’s a problem that could be solved rather quickly with a Regex and the text of Ursula LeGuin’s works. But that would hardly be sporting. Can we prove or disprove this? Not really, but it’s worthwhile to keep in mind other possibilities: that no word breaks are used, ASENGLISHCANBEWRITTENLIKETHISWITHOUTLOSINGMUCHMEANING, or another grapheme is used as a “word-break” character – the last character of the first line is a possibility.
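    The regex idea in assumption 3 might look something like this. It is a sketch only: `shape_regex` is a hypothetical helper, the sample sentence is invented nonsense rather than anything from Le Guin, and the whole approach stands or falls with assumptions 1 to 3 (one grapheme per Latin letter, one line per word).

```python
import re

def shape_regex(lengths):
    """Build a pattern matching consecutive words with exactly these letter counts."""
    words = [f"[A-Za-z]{{{n}}}" for n in lengths]
    # \W+ between words allows spaces and punctuation; note that it also
    # splits on apostrophes, so "don't" would count as "don" plus "t".
    return re.compile(r"\b" + r"\W+".join(words) + r"\b")

# Word lengths read off the shape "xxxxxxxx x xxxxx xxxxxxx x xxxxxxx".
pattern = shape_regex([8, 1, 5, 7, 1, 7])

# Invented sample text, used only to show the pattern firing.
sample = "Then elephant a green balloon I watched, oddly."
match = pattern.search(sample)
print(match.group(0) if match else "no match")
```

    Run over the full text of a book, a pattern like this would list every candidate phrase with the right word shape, which could then be checked against the repeated symbols.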

    More thought required here…. but at least I learnt a new word (“grapheme”).

    Reply

  10. Stilgherrian’s avatar

    @Jason Langenauer: “Grapheme” is indeed a wond’rous word. “Alphabet” is an interesting word too, and one which deserves further consideration.

    Reply

  11. Bob Bain’s avatar

    Ah hah

    Currently considering FONTS !

    e.g.

    http://www.searchfreefonts.com/free/bisaya-1880.htm

    Type Stilgherrian into the box and you get symbols with dots and lines BUT if the solution is a FONT then this isn’t the font that may be being used as it doesn’t produce the required graphemes.

    Microsoft Wingdings doesn’t work either !

    Some of these non-latin fonts approach the type of grapheme being considered.

    (working on it.. working on it… )

    Reply

  12. Bob Bain’s avatar

    Stilgherrian’s comment that “grapheme” is indeed a wond’rous word…

    reference http://en.wikipedia.org/wiki/Grapheme

    “A grapheme (from the Greek: γράφω, gráphō, “write”) is the fundamental unit in written language. Examples of graphemes include alphabetic letters, Chinese characters, numerical digits, punctuation marks, and all the individual symbols of any of the world’s writing systems.”

    noting that “Alphabet” is “an interesting word too, and ONE WHICH DESERVES FURTHER CONSIDERATION” appears to indicate that we can ignore Chinese characters, numerical digits, punctuation marks, and the odd assortment of individual symbols of any of the world’s writing systems.

    This rules out such things as Ideograms http://en.wikipedia.org/wiki/Ideogram

    “An ideogram or ideograph (from Greek ἰδέα idea “idea” + γράφω grafo “to write”) is a graphic symbol that represents an idea or concept. Some ideograms are comprehensible only by familiarity with prior convention; others convey their meaning through pictorial resemblance to a physical object, and thus may also be referred to as pictograms”

    Over the coming weeks I’m concentrating on fonts ! :-)

    Reply

  13. Stilgherrian’s avatar

    @Bob Bain: It’s good that you’re clarifying all the terms — “alphabet”, “grapheme”, “ideogram”, “font”… but pursuing fonts as such is the wrong path. This is a new, unique script, and I reckon the picture in this page would be the only example on the entire Internet.

    I won’t say any more just now, because someone’s high school English class will be having a go at this soon. Not too many clues!

    Reply

  14. Bob Bain’s avatar

    Fonts are interesting in their own right and I am delving into aspects of Microsoft Word at Penrith Valley Seniors Computing Club – so even though fonts may not have any direct relevance to this puzzle I will be looking at fonts in the ensuing weeks.

    With regard to an aspect of alphabets that we, the puzzle solvers, are overlooking, I dug this out from the Internet last night.

    http://www.answers.com/topic/history-of-the-alphabet

    No doubt the English class mentioned in your entry this morning can find the aspect of alphabets we seem to be overlooking, noting that alp = “ox” and bet = “house” in the Proto-Canaanite invocation of this phenomenon.

    I await insight from those much younger than myself !

    Reply

  15. Quatrefoil’s avatar

    Ok, I think I understand the principle, but then I should.

    I wonder whether you had to go to the library to sort this out back when it was first set?

    I’m up for the challenge.

    Reply

  16. Stilgherrian’s avatar

    @Quatrefoil: Glad you’re joining the fray. No, I didn’t need to go to the library.

    Reply

  17. Eric TF Bat’s avatar

    Interesting: the Tolkien script uses those accent-like dots and squiggles for vowels, so for example the first word, transliterated as Ennyn, is really [n][n] with an [underlined acute] meaning [e], and a [pair of dots] meaning [y]. I speculated there was something similar in your example: the first symbol (top left), looking like an F with two dots, recurs in a bunch of places, and there are also similar ones with different dots and frou-frou, like the second symbol on the fourth row. I’m dividing the larger symbols into what I think are letters — for example, the first one on the bottom line may be one of those F things with one dot and a loop, preceded by a smaller L shape. Two letters? Three, counting the loop+dot as a vowel? Perhaps. I should really be working…

    Reply

  18. Francis’s avatar

    Argh. I used to have a key to this. Possibly still do, probably in the back of some old mathematics notes. From 1978. My goodness this brings back very old memories. I wonder if I can remember the logic behind the script. As I look at it some of it comes back to me. I remember when Danny was creating it.

    Reply

  19. Bob Bain’s avatar

    Stilgherrian may not have found it necessary to go to a library to solve this as I get the impression that he is a fan of this type of fiction and so has a head-start over those of us who haven’t quite got the hang of it. I have therefore been delving into the works of Ursula Le Guin and have discovered that in the Wizard of Earthsea Le Guin introduced a “special language”

    http://stuffedhead.wordpress.com/2009/03/21/88/

    Part I: Magic Controlled by Speech

    One of the most well known aspects of the Earthsea series is the magic system. Le Guin created a system where wizards use a special language called the Old Speech to work magic. As a passage from A Wizard of Earthsea explains,

    “In the world under the sun, and in the other world that has no sun, there is much that has nothing to do with men and men’s speech, and there are powers beyond our power. But magic, true magic, is worked only by those beings who speak the Hardic tongue of Earthsea, or the Old Speech from which it grew.”

    Now if a person has read Le Guin’s works then he or she would have a head start over the rest of us methinks…

    Reply

  20. Bob Bain’s avatar

    Update.. checked out Le Guin in the Galaxy Bookshop. She wrote verse about hexagrams which leads me to the I Ching.

    http://en.wikipedia.org/wiki/I_Ching

    The text of the I Ching is a set of oracular statements represented by 64 sets of six lines each called hexagrams (卦 guà). Each hexagram is a figure composed of six stacked horizontal lines (爻 yáo), each line is either Yang (an unbroken, or solid line), or Yin (broken, an open line with a gap in the center). With six such lines stacked from bottom to top there are 2^6 or 64 possible combinations, and thus 64 hexagrams represented.

    The oracular interpretation of the symbolic language based on trigram symbols formed from yang and yin components is well known. However, the inherent numerical language of line change and non-change is relatively unknown.
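    The figure of 64 in the quoted passage is simply two choices per line over six lines. A short enumeration, using the labels "yin" and "yang" as placeholder names, confirms the count:

```python
from itertools import product

LINE_TYPES = ("yin", "yang")  # broken or unbroken line

# Every way of stacking six lines, each either yin or yang.
hexagrams = list(product(LINE_TYPES, repeat=6))

print(len(hexagrams), "hexagrams; 2**6 =", 2 ** 6)
```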

    ==================

    ah hah ! Progress !!

    Reply

  21. Quatrefoil’s avatar

    I’m making a very different set of assumptions to yours Bob – I think it’s a whole lot simpler than that. I don’t think it’s a numerically based script – I did think about twig runes which work as a counting code, but I don’t think it’s that.

    I have now eliminated quite a few possibilities for words, but haven’t found any definite combinations.

    I’m proceeding on a largely grammatical/linguistic basis.

    Reply

  22. Quatrefoil’s avatar

    And it would be very useful to know a tense, but that would feel like cheating.

    Reply

  23. Stilgherrian’s avatar

    @Eric TF Bat: Tolkien deliberately designed his languages and writing systems so they were plausible within his pre-history. As I’m sure you know, the narrative of The Lord of the Rings sits at the end of the age of Elves, Dwarves and Hobbits and the beginning of the age of Men.

    In Tengwar, the script I showed before, vowels are indicated much as in Tibetan and other Brahmi-derived scripts. A similar system is used in Thai today.

    ตับหวานอร่อยมากๆ!

    Is any of that a clue? I have no idea.

    @Francis: If you do find the key, please let me know. I’d love to see the original explanation. But email it to me, don’t post it here. Not yet, anyway.

    @Bob Bain and @Quatrefoil: There’s no need to over-complicate it.

    Reply

  24. Quatrefoil’s avatar

    It’s official — Stilgherrian is cleverer than I am. Two days have elapsed and I’m a long way from a solution — though I’ve tested a fair few possibilities.

    I’m not being terribly complicated about it — I only wanted to know the tense since it would give me a clue about expected distributions of word endings in English, but I think I’ve answered that question at least. And since it’s a literary text, I wouldn’t expect standard distribution patterns to necessarily hold true anyway.

    I now have about six pages of if/then statements, and a lengthy statement of assumptions. I just don’t have any positive correlations.

    Last night I dreamed of Thai elves with glottal stops (which is probably better than Mandalay).

    Reply

  25. Bob Bain’s avatar

    @Stilgherrian “Is any of that a clue? I have no idea”

    You make mention of vowels, reminding us that alphabets comprise vowels and consonants, which may be the issue we have been overlooking when it comes to alphabets. As it notes on an Internet page somewhere, there are a considerable number of “vowels” in the International Phonetic Alphabet, which as an aside is referred to on Ursula Le Guin’s home page. There are over 40 vowel sounds I believe.

    As I write the word “sounds” I am reminded that alphabets attempt to document sounds that can be produced by the human vocal cords, as opposed to the symbolism of number found in mathematics.

    Perhaps we should be looking at intonation — the rise and fall of the strokes and attempting to reconcile this back to human speech — in English.

    An approach I took yesterday was to draw a thin blue line through the middle of each row of graphemes and examine the marks above and below the line.

    Quatrefoil appears to be attempting a solution using FORTRAN :-), a computer language which derived its name from FORmula TRANslator. Perhaps we should be using LISP, the language for Artificial Intelligence: “Lots of Irritating Stupid Parentheses”.

    Reply

  26. Stilgherrian’s avatar

    @Quatrefoil: The quote is in relatively straightforward English. No weird Riddley Walker or overly-poetic constructions.

    @Bob Bain: Getting warmer.

    Reply

  27. Francis’s avatar

    After a fruitless search I am keyless. I can even picture in my mind’s eye the old notebook I need – Stats 1H. But of all the old books in a box in the cellar, that one is missing. Found lots of books of notes with my handwriting in them, but no recollection of ever taking the course. Applicable Analysis anyone? Actually looked to be pertinent too.

    My very faint recollection of the explanation Danny gave me was indeed that relative position above and below the dominant line is a key part of the encoding. I think I have the main structure of the text sorted, but fitting the minor consonants isn’t so trivial. One clue to the provenance of the quote is that I will bet the text we see was inscribed in 1978. So that seriously limits the books from whence it came.

    Reply

  28. Stilgherrian’s avatar

    @Francis: You correctly date the puzzle. My fuzzy memory was placing it in 1979 at the very latest, but I reckon 1978 is closer to the mark. There is much in your memory which is correct.

    Not relevant to solving the puzzle, but I’ll mention it anyway: The image everyone is looking at here is not in Danny’s original hand, but my own facsimile of it, because the original was damaged in some way. I doubt that it still exists.

    Reply

    1. Quatrefoil’s avatar

      @ Bob Bain:

      No – I’m almost illiterate in the languages of computers, but I do read a number of medieval and modern languages, and have some training in logic (which is where computing got it from in the first place).

      I’m just testing theories, and yes, I figured that we’re talking about sounds, not letters.

      So my reasoning goes:

      If Graph A = Sound B, then Graph C cannot equal Sound D, because it would result in either a set of sounds that don’t occur next to each other in English, or a grammatical problem – e.g. singular subject with a plural verb.
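      That style of elimination can be sketched in a few lines. Everything here is illustrative: the graph labels are hypothetical, `consistent` is an invented helper, and the set of "impossible" neighbouring sounds is a tiny made-up stand-in for a real table of English sound combinations.

```python
# Made-up examples of sound pairs treated as impossible neighbours in English.
# A serious attempt would need a proper phonotactic table instead.
FORBIDDEN_PAIRS = {("h", "ng"), ("ng", "h"), ("zh", "sh")}

def consistent(word_graphs, mapping):
    """Return False if the mapping puts two forbidden sounds side by side."""
    sounds = [mapping.get(graph) for graph in word_graphs]  # None = not yet guessed
    for left, right in zip(sounds, sounds[1:]):
        if left is None or right is None:
            continue  # cannot judge a pair until both graphs have a guess
        if (left, right) in FORBIDDEN_PAIRS:
            return False
    return True

word = ["A", "C", "E"]  # hypothetical graph labels for one word

print(consistent(word, {"A": "h", "C": "t"}))   # allowed so far
print(consistent(word, {"A": "h", "C": "ng"}))  # ruled out by the table
```

      Each guess that fails a check like this prunes every mapping built on it, which is how a few pages of if/then statements can be narrowed down.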

      The problem in that is working out what constitutes a single graph which signifies a phoneme.

      @ Stilgherrian:

      Yes, I was assuming ordinary English, but literary texts tend to have different rhythms from non-literary ones – such as doubling of adjectives, inversion of word order etc. If I remember correctly, though, Le Guin isn’t an overly flowery writer.

      I suspect that I will figure it out eventually, using trial and error, but the monkeys with the typewriters might get there first (which isn’t to say that anyone who solves it before I do is a monkey).

      Reply

  29. Quatrefoil’s avatar

    @ Stilgherrian

    And since it’s your handwriting, can you tell me if a sharp bend is intentionally different from a smooth curve, or is that just a variation in the hand?

    Reply

  30. Stilgherrian’s avatar

    Fragment of the Script Challenge, showing that different shapes are significant

    @Quatrefoil: Here’s a fragment from the last line of the sample text. I’ve circled what I think is the variation in “sharp bend” versus “smooth curve” which you refer to. These represent two different things.

    Small details are, in general, important. Here, as in the rest of life. Gosh.

    Reply

  31. Bob Bain’s avatar

    @Quatrefoil

    I’m familiar with simple logic and there is a formal system of symbol manipulation that assists. Some of this can also be performed with a Venn Diagram

    http://en.wikipedia.org/wiki/Venn_diagram

    To test simple logical statements I find Venn diagrams and Truth tables understandable.

    However as far as “logic” is concerned as Wikipedia notes

    http://en.wikipedia.org/wiki/Logic

    “Just as we have seen there is disagreement over what logic is about, so there is disagreement about what logical truths there are”

    As far as digital computers are concerned the breakthrough in logic came with Boole and the application of Boolean algebra to electronic circuits

    http://en.wikipedia.org/wiki/Digital_electronics

    http://en.wikipedia.org/wiki/Boolean_algebra_(logic)

    “Boolean algebra (or Boolean logic) is a logical calculus of truth values, developed by George Boole in the 1840s”

    As explained to me once in an elementary electronics course, it was put to George Boole in the 1800s, “That’s fine, George, but what possible use is this?” It wasn’t until the advent of the digital computer that Boolean algebra became valuable, through the AND, OR and NOR gates which control electronic circuitry. Boolean logic is also used in programming but this is a slightly different concept as it involves human beings who can and do (too frequently) get it wrong.

    We also have Fuzzy logic http://en.wikipedia.org/wiki/Fuzzy_logic

    This is widely used in the application areas listed – mostly in the electronic sphere.

    When it comes to digital computers attempting to emulate the human brain then we enter the world of Neural Networks and Genetic Algorithms

    http://en.wikipedia.org/wiki/Neural_network
    http://en.wikipedia.org/wiki/Genetic_algorithm

    In the end understanding the world is best left to the human brain which is the computer most used by our biological species.

    It is from this base I am attempting to understand the symbolism and am thinking of attempting to emulate it via sounds and exploring the wave forms produced and comparing this to the symbols in the script. All such wave forms are a subset of a sine curve I believe.

    http://en.wikipedia.org/wiki/Sine_wave

    Check out the diagrams in the article above and perhaps examine sine, square, triangle and sawtooth.

    I don’t believe any type of “formal logic” will work in this situation.

    After all that I am having difficulty too.

    @Stilgherrian Sir. Can we apply for a government grant to help solve this ?

    Reply

  32. Stilgherrian’s avatar

    @Bob Bain: “Government grant”? To solve a recreational puzzle, based on a made-up writing system of no real value to anyone except a small group of people? Sure! The Australia Council would be your best bet.

    Reply

  33. Bob Bain’s avatar

    @stilgherrian There was a government grant ($20,000 I seem to recall) once awarded for the production of a rather glitzy pornographic magazine.

    It had elements of artistic merit.

    Reply

  34. Jason Langenauer’s avatar

    So, after a couple of weeks away, I’ve come back to have a look with fresh eyes.

    Two things jump out – Stilgherrian’s comment right at the top that it was “English, but written in a different alphabet”, and the continued hints about “alphabet”. So perhaps the alphabet used has graphemes for sounds that we normally might represent by a digraph in English – for example, “sh” or “ch”. Hmmm….

    But of particular interest in teh Wikipedia, is this article on Devanagari, the alphabet used to write Hindi, Urdu and Sanskrit. The article notes that the alphabet is “recognizable by a distinctive horizontal line running along the tops of the letters that links them together”. Now where have we seen that before?

    http://en.wikipedia.org/wiki/Devanagari

    Reply

  35. Jason Langenauer’s avatar

    Bah, it’s not used for Urdu at all. Some of this damned dust must have got into my brain…

    Reply

  36. Stilgherrian’s avatar

    @Jason Langenauer: There is some sense in what you write… though I doubt that Urdu will help you very much. Not that I know anything about Urdu, or could be arsed searching for same.

    Reply