Script Challenge: can you figure this out?

I’ve decided that each weekend I’ll dig out an object or two from my more distant past and write about it. To kick things off, here’s a challenge which was originally created by the same chap who coined my name.

The text you can see in the image below (at least if you happen to be sighted) is in an unknown script. Your task is obvious, I think.

The only clues you have are that it’s a quote from a book by Ursula LeGuin and it’s nothing whatsoever to do with Tolkein.

Image of text in an unknown alphabet

Now originally I solved this in under 2 days, without the aid of computers or amphetamines. I reckon that in The Age of the Internet you can do better. I’ll negotiate a suitable prize for the first person who posts the solution.

104 Replies to “Script Challenge: can you figure this out?”

  1. @Eric TF Bat: Tolkien deliberately designed his languages and writing systems so they were plausible within his pre-history. As I’m sure you know, the narrative of The Lord of the Rings sits at the end of the age of Elves, Dwarves and Hobbits and the beginning of the age of Men.

    In Tengwar, the script I showed before, vowels are indicated much as in Tibetan and other Brahmi-derived scripts. A similar system is used in Thai today.


    Is any of that a clue? I have no idea.

    @Francis: If you do find the key, please let me know. I’d love to see the original explanation. But email it to me, don’t post it here. Not yet, anyway.

    @Bob Bain and @Quatrefoil: There’s no need to over-complicate it.

  2. It’s official — Stilgherrian is cleverer than I am. Two days have elapsed and I’m a long way from a solution — though I’ve tested a fair few possibilities.

    I’m not being terribly complicated about it — I only wanted to know the tense since it would give me a clue about expected distributions of word endings in English, but I think I’ve answered that question at least. And since it’s a literary text, I wouldn’t expect standard distribution patterns to necessarily hold true anyway.

    I now have about six pages of if/then statements, and a lengthy statement of assumptions. I just don’t have any positive correlations.

    Last night I dreamed of Thai elves with glottal stops (which is probably better than Mandalay).

  3. @Stilgherrian “Is any of that a clue? I have no idea”

    You make mention of vowels reminding us that alphabets are comprise vowels and consonants which may be the issue we have been overlooking when it comes to alphabets. As it notes on an Internet page somewhere there are a considerable number of “vowels” in the International Phonetic Alphabet — which as an aside is referred to on Ursula Le Guin’s home page. There are over 40 vowel sounds I believe.

    As I write the word “sounds” I am reminded that alphabets attempt to document sounds that can be produced by the human vocal chords — as opposed the symbolism of number found in mathematics.

    Perhaps we should be looking at intonation — the rise and fall of the strokes and attempting to reconcile this back to human speech — in English.

    An approach I took yesterday was to draw a thin blue line through the middle of each row of graphemes and examine the marks above and below the line.

    Quatrefoil appears to be attempting a solution using FORTAN 🙂 — a computer language which derived it’s name from FORMula Translator. Perhaps we should be using LISP — the language for Artificial Intelligence — “Lots of Irritating Stupid Parentheses”.

  4. @Quatrefoil: The quote is in relatively straightforward English. No weird Ridley Walker or overly-poetic constructions.

    @Bob Bain: Getting warmer.

  5. After a fruitless search I am keyless. I can even picture in my mind’s eye the old notebook I need – Stats 1H. But of all the old books in a box in the cellar, that one is missing. Found lots of books of notes with my handwriting in them, but no recollection of ever taking the course. Applicable Analysis anyone? Actually looked to be pertinent too.

    My very faint recollection of the explanation Danny gave me was indeed that relative position above and below the dominant line is a key part of the encoding. I think I have the main structure of the text sorted, but fitting the minor consonants isn’t so trivial. One clue to the provenance of the quote is that I will bet the text we see was inscribed in 1978. So that seriously limits the books from whence it came.

  6. @Francis: You correctly date the puzzle. My fuzzy memory was placing it in 1979 at the very latest, but I reckon 1978 is closer to the mark. There is much in your memory which is correct.

    Not relevant to solving the puzzle, but I’ll mention it anyway: The image everyone looking at here is not in Danny’s original hand, but my own facsimile of it — because the original was damaged in some way. I doubt that it still exists.

    1. @ Rob Bain:

      No – I’m almost illiterate in the languages of computers, but I do read a number of medieval and modern languages, and have some training in logic (which is where computing got it from in the first place).

      I’m just testing theories, and yes, I figured that we’re talking about sounds, not letters.

      So my reasoning goes:

      If Graph A = Sound B, then Graph C cannot equal Sound D, because it would result in either a set of sounds that don’t occur next to each other in English, or a grammatical problem – e.g. singular subject with a plural verb.

      The problem in that is working out what constitutes a single graph which signifies a phoneme.

      @ Stilgherrian:

      Yes, I was assuming ordinary English, but literary texts tend to have different rhythms from non-literary ones – such as doubling of adjectives, inversion of word order etc. If I remember correctly, though, le Guin isn’t an overly flowery writer.

      I suspect that I will figure it out eventually, using trial and error, but the monkeys with the typewriters might get there first (which isn’t to say that anyone who solves it before I do is a monkey).

  7. @ Stilgherrian

    And since it’s your handwriting, can you tell me if a sharp bend is intentionally different from a smooth curve, or is that just a variation in the hand?

  8. Fragment of the Script Challenge, showing that different shapes are significant @Quatrefoil: Here’s a fragment from the last line of the sample text. I’ve circled what I think is the variation in “sharp bend” versus “smooth curve” which you refer to. These represent two different things.

    Small details are, in general, important. Here, as in the rest of life. Gosh.

  9. @Quatrefoil

    I’m familiar with simple logic and there is a formal system of symbol manipulation that assists. Some of this can also be performed with a Venn Diagram

    To test simple logical statements I find Venn diagrams and Truth tables understandable.

    However as far as “logic” is concerned as Wikipedia notes

    “Just as we have seen there is disagreement over what logic is about, so there is disagreement about what logical truths there are”

    As far as digital computers are concerned the breakthrough in logic came with Boole and the application of Boolean algebra to electronic circuits

    “Boolean algebra (or Boolean logic) is a logical calculus of truth values, developed by George Boole in the 1840s”

    As explained to me once in an elementary electronics course it was put to George Boole in the 1800’s “Thats fine George but what possible use is this ?” and it wasn’t until the advent of the digital computer that Boolean algebra becomes valuable involving AND OR and NOR gates which control electronic circuitry. Boolean logic is also used in programming but this is a slightly different concept as it involves human beings who can and do (too frequently) get it wrong.

    We also have Fuzzy logic

    This is widely used in the application areas listed – mostly in the electronic sphere.

    When it comes to digital computers attempting to emulate the human brain then we enter the world of Neural Networks and Genetic Algorithms

    In the end understanding the world is best left to the human brain which is the computer most used by our biological species.

    It is from this base I am attempting to understand the symbolism and am thinking of attempting to emulate it via sounds and exploring the wave forms produced and comparing this to the symbols in the script. All such wave forms are a subset of a sine curve I believe.

    Check out the diagrams in the article above and perhaps examine sine, square, triangle and sawtooth.

    I don’t believe any type of “formal logic” will work in this situation.

    After all that I am having difficulty too.

    @Stilgherrian Sir. Can we apply for a government grant to help solve this ?

  10. @stilgherrian There was a government grant ($20,000 I seem to recall) once awarded for the production of a rather glitzy pornographic magazine.

    It had elements of artistic merit.

  11. So, after a couple of weeks away, I’ve come back to have a look with fresh eyes.

    Two things jump out – Stilgherrian’s comment right at the top that it was “English, but written in a different alphabet”, and the continued hints about “alphabet”. So perhaps the alphabet used has graphemes for sounds that we normally might represent by a digraph in English – for example, “sh” or “ch”. Hmmm….

    But of particular interest in teh Wikipedia, is this article on Devanagari, the alphabet used to write Hindi, Urdu and Sanskrit. The article notes that the alphabet is “recognizable by a distinctive horizontal line running along the tops of the letters that links them together”. Now where have we seen that before?

  12. Bah, it’s not used for Urdu at all. Some of this damned dust must have got into my brain…

  13. @Jason Langenauer: There is some sense in what you write… though I doubt that Urdu will help you very much. Not that know anything about Urdu, or could be arsed searching for same.

  14. Wow. This has been sitting here for a long time and still no headway. Stil, can you give everybody either the answer or another large hint to get it going?

  15. @Daniel Edmonds: I’ll ponder your question on the weekend, Daniel. I must admit, though, there’s a lot of really good clues here already which people seem to have missed.

  16. My understanding of it is that each character represents a word. The characters are geometric, the letters are represented by the lines branching off of the middle line. I figure this because many of the characters have similar ‘parts’ to them (such as the wedge < on the last character on the first line, amongts others, the '4' turn which is juxtaposed to the loopy turn, and of course the dot).

    I figure the first character ("double-dot f") and the fifth character (wedge-f) are often-used words such as 'the', 'in' or 'of'. That the characters are words in themselves is supported by several character with a huge number of strokes (far too many for it to realistically be a single letter).

    Your continued hint at 'alphabet' makes me think that in english, our alphabet is one letter-per-sound. But in other alphabets, the characters represent groups of sounds, whole words, meanings. So that supports my guess that the individual characters don't merely represent a letter. Also, the characters are not groups together in any way to represent words.

  17. @Daniel Edmonds: You’ve got some good thinking in that analysis, though not all of it is correct. The sample text is a series of words, in English, separated by white space. And the analysis of word- and letter-frequency is indeed an important tool in cryptanalysis.

    For example, single-alphabet letter-substitution ciphers such as the Caesar cipher, where a given letter of the alphabet is always substituted with another, and always the same letter, are easily broken by remembering that the frequency of letters in typical English text typically runs ETAOINSHRDLUCYMFWP. Or thereabouts.

    You’re right in saying that different alphabets transcribe sounds in different ways. But you’re wrong in saying that when English is transcribed in the Roman alphabet, the mapping is always one letter per “sound”. It’s better to use the technical word linguists use when discussing phonetics: “phoneme”.

    For example, the musical verb “sing” is usually spelled like that, with four letters in the Roman transcription of English. But in International Phonetic Alphabet is has only three letters — only one for the final “ng” because that’s considered to be one phoneme, the so-called “dorsal nasal velar“.

    I’d show you the actual IPA rendering but I’m having trouble copy-and-pasting it into WordPress and it’s too early for me to be bothered working out how to do that. However any decent dictionary should show you the IPA transcription.

    I reckon that’s a lot of clues all at ones, given what else I’ve said on this topic.

  18. @Stil

    Thanks for clearing up the difference between ‘sound’ and ‘phoneme’, I should have used that word seeing other people had used it.

    Since you keep showing us the difference between ‘sound’ and ‘phoneme’, I’m led to think that this is extremely important. I’m not sure if it’s the correct word, but I’ll use ‘glyph’ to represent the different partitions of a grapheme. One glyph represents either one phoneme (ie. ‘th’ in ‘then’, a in ‘father’), or a group of phonemes, but i’m want to suspect the former, because I don’t think there are enough glyphs to represent all possible groupings of phonemes.

    The center-line is obviously releavant, because sometimes a glyph crosses that line, or other times merely sits under or over it. I am not going to count the center-line as a possible phoneme as it exists in every grapheme.

    The one thing I’m having difficulty in is separating the glphs. Many are easy (the jagged curve in L1G3 (Line 1 Grapheme 3) and at the beginning of L1G7, amongst others). However some are difficult – take L1G1 – not counting the centerline, there is a south-west quadrant dash, a north-east quadrant dash, a vertical line and two dots – the dots are probably vowel indications, given your love of qwenya and sindarin, but I’m unsure if each of these is a phoneme in itself, or perhaps joins with another stroke to create a phoneme (ie. the verticle and southwest dash could join to become ‘d’).

    But I figure that L1G3 is made up of the following glyphs (i’ve separated them):; Perhaps glyph 2 & 3 belong together as one glyph?

    [Stilgherrian writes: To save you having to click through, here’s Daniel’s (incorrect) breakdown of L1G3.]

  19. @Daniel Edmonds: There’s two reasons that I specifically use the word “phoneme”, and only one of them is that I formally studied linguistics and therefore try to stick to the correct technical terminology.

    I was going to say that another word that comes to mind is “cursive“, but having read that Wikipedia entry I think that might be misleading.

    Your breakdown of L1G3 has yielded too many pieces.

  20. Far too many or just one or two too many? I can see now that pieces two and three in my picture are in fact one piece.

    Regardless, I don’t see myself getting anywhere with this one. There are far too many loaded assumptions for me to get my head around. I’ve sat here for half an hour staring it and now have more questions than answers.

    Why does the down-ward squiggle in L1G1 sometimes break the center-line, but other times like L4G3 (the second instance of the glyph) the centerline continues?

    What difference does it make when a glyph starts below the line, or when the same glyph goes through the line, or when it’s not even on the line?

    Are the dots the same glyph, or different glyphs when they’re in different places?

  21. A very vague memory (from 30 years ago.) This wasn’t designed to be a code. It was designed to be beautiful typography. So ligands are a reasonable thing. It was designed by a computer scientist. There is underlying structure and rationale to it. There is beauty in that too.

    I’m trying to avoid looking at it again. Too much other stuff to worry about without going down that slippery slope.

  22. @Francis: Absolutely everything you say there is true, except that the word you were looking for is ligature rather than ligand.

    While this isn’t a code intended to hide meaning, but a script intended to transmit meaning, some techniques of crypytanalysis can help. Turing would have noted that they’re all codes anyway.

    @Daniel Edmonds: Your reading of L1G3 is now correct. You just broke it up into one more piece that was correct, as you suspected to begin with.

    I’m reluctant to give too many more clues, as the conversation already links to everything you’d need, including a chart that… well, perhaps that’s saying too much. But here’s three little tidbits to help you along.

    1. You’re asking the right kind of questions. Everything has meaning, except for the things that don’t. Make a list, and cross off one item. You have already mentioned the item.
    2. While I did say that detailed consideration of Tolkien’s Tengwar script would take you down the wrong path, Tengwar and this script have one specific feature in common. If George Bernard Shaw were still alive, he’d know what I was referring to.
    3. “And thirteen shillings?”

    That’s your ration for the day.

  23. Just in case anybody is reading this while I’m working on the next bit, Stil’s reference to George Bernard Shaw is a reference to the Shavian Alphabetic, a phonemic/phonetic alphabet. At first I thought it was in reference to the diacritics that the tengwar script uses to denote vowels, but the Shavian alphabet does not have these.

    Something interesting about the Shavian script is that whether the letter is written above or below the centerline denotes if it is a hard or a soft sound (ie. the ‘th’ and ‘they’ and ‘thick’).

    The Elvish language is also phonetic, and uses a similar system to change a hard sound to a soft sound. From what I have briefly read, either a line is added or the number is spun around…

    I have to look more into this.

    ‘Everything has meaning’ is most likely in reference to that every stroke has meaning, except for one, the centerline. I’m trying to make a list, but some characters are still hard to separate the different ‘ligatures’. I still am not sure about L1G1 – not counting the middle line, there are three lines + the two dots. Is this three phonemes? four? More? Less? Gah. L1G2 is similary difficult – I’m counting three ligatures – the first vertical line, the second verticle line with the ‘4’ loop and the two dots – three phonemes?

    I’ll give it to you Stil. I’m hooked.

  24. @Daniel Edmonds: I’m glad you got the Shaw reference. I won’t add anything just for the moment, because I think there’s enough pieces to play with for the time being.

Comments are closed.