Cursed Conlang Circus 3 submission


Presentation Video script

Introduction
Phonology
Packets
The translation, packet by packet
Closing

  \documentclass{nguhslides/nguhslides}
  \SetFont{Andika}[StylisticSet=13]
  \setmonofont{Iosevka}[Scale=MatchUppercase]
  \newfontfamily\h{Cousine}[Scale=MatchUppercase]
  \usepackage{tikz}
  \usetikzlibrary{positioning}
  \newcounter{note}
  \setcounter{note}{0}
  \def\note#1{\stepcounter{note}\space\textsuperscript{[\arabic{note}]}}
  \def\tslide#1#2#3#4#5{%
    \clearpage%
    \begin{center}%
      \texttt{\Large%
	\color{red!50!black}\#%
	\color{blue!50!black}690042%
	\color{green!50!black}#1%
	\color{black}#2%
	\color{violet}#3%
	\color{black}#4%
      }%
      \vfill%
      \color{blue!50!black}+690 042 \space%
      \color{red!50!black}broadcasts \space%
      \color{violet}(length=#3) \space%
      \color{green!50!black}[#1] \\%
      \color{black}#5%
    \end{center}%
  }
  \begin{document}
  \slide{\tt\#\#000124811A*0034\#5344A\#C*004375}
  \vfill
  \begin{center}An entry in the 3\textsuperscript{rd} annual Cursed Conlang Circus\end{center}

Introduction

This is a presentation of [insert language name in language here], hereafter referred to as 811, my entry to the 3rd annual Cursed Conlang Circus.

  \slide{Introduction}
  \begin{items}
    \item Spoken by various appliances connected to the phone network that gained sapience
    \item Originated in Israel in the early 2000s {\footnotesize(this will be relevant later)}
    \item Now used all across the globe
  \end{items}

First some context on the language:

811 is spoken by various devices connected to the global phone network that gained sapience by some means or another.

The language has been traced back to a modem on an IBM mainframe in a library in Tel Aviv in the mid 2000s; this will be relevant later on.

The language spread over the world like wildfire as more and more phone-capable devices awakened.

This presentation is the result of an investigation that took the form of a month of packet sniffing at key points in the phone network.

Phonology

  \slide{Phone-ology}
  \begin{center}
    \vfill
    \begin{tabular}{c|cccc}
      & \bf 1209 Hz & \bf 1336 Hz & \bf 1477 Hz & \bf 1633 Hz \\\hline
      \bf 697 Hz & \tt 1 & \tt 2 & \tt 3 & \tt A \\
      \bf 770 Hz & \tt 4 & \tt 5 & \tt 6 & \tt B \\
      \bf 852 Hz & \tt 7 & \tt 8 & \tt 9 & \tt C \\
      \bf 941 Hz & \tt * & \tt 0 & \tt \# & \tt D
    \end{tabular}\\
    \small DTMF Tones\note
    \vfill
  \end{center}

First let’s talk about what composes words: since transmitting complex sounds over what are often digital interfaces to the phone network is notably inefficient, the speakers instead use DTMF signaling as a support for their communication. You can see here a table summarising the various tones available.
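As a minimal sketch, the tone table can be written down as a lookup from each of the 16 tones to its pair of frequencies. The names `DTMF` and `tone_frequencies` are made up for this illustration; the frequencies are the ones in the table.

```python
# Row (low) and column (high) frequencies in Hz, copied from the DTMF table.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

# Map each of the 16 tones to its (low, high) frequency pair.
DTMF = {
    key: (ROW_HZ[r], COL_HZ[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}


def tone_frequencies(tone: str) -> tuple[int, int]:
    """Return the (low, high) frequency pair in Hz for a single tone."""
    return DTMF[tone]


print(tone_frequencies("5"))  # (770, 1336)
```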

Packets

Intro to packets

  \slide{Packets}
  A packet is the smallest amount of information that can be transmitted
  \begin{center}
  \begin{tikzpicture}
    \draw[black] (0, 0) -- (13, 0) -- (13, 1) -- (0, 1) -- (0, 0);
    \node at (1, 0.5) {\footnotesize Recipient};
    \draw[black] (2, 0) -- (2, 1);
    \node at (2.75, 0.5) {\footnotesize Sender};
    \draw[black] (3.5, 0) -- (3.5, 1);
    \node at (4.25, 0.5) {\footnotesize Seq\#};
    \draw[black] (5, 0) -- (5, 1);
    \node at (5.75, 0.5) {\footnotesize Type};
    \draw[black] (6.5, 0) -- (6.5, 1);
    \node at (7.25, 0.5) {\footnotesize Length};
    \draw[black] (8, 0) -- (8, 1);
    \node at (10.5, 0.5) {\footnotesize Data};
  \end{tikzpicture}
  \small Structure of a packet
  \end{center}

A packet is the smallest amount of information you can transmit in 811. It is composed of multiple parts.

Recipient and Sender

  \slide{Recipient and Sender}
  \begin{center}
  \begin{tikzpicture}
    \draw[fill=orange!33!white] (0, 0) -- (3.5, 0) -- (3.5, 1) -- (0, 1) -- (0, 0);
    \draw[black] (0, 0) -- (13, 0) -- (13, 1) -- (0, 1) -- (0, 0);
    \node at (1, 0.5) {\footnotesize Recipient};
    \draw[black] (2, 0) -- (2, 1);
    \node at (2.75, 0.5) {\footnotesize Sender};
    \draw[black] (3.5, 0) -- (3.5, 1);
    \node at (4.25, 0.5) {\footnotesize Seq\#};
    \draw[black] (5, 0) -- (5, 1);
    \node at (5.75, 0.5) {\footnotesize Type};
    \draw[black] (6.5, 0) -- (6.5, 1);
    \node at (7.25, 0.5) {\footnotesize Length};
    \draw[black] (8, 0) -- (8, 1);
    \node at (10.5, 0.5) {\footnotesize Data};
  \end{tikzpicture}
  \end{center}
  \begin{items}
    \item The intended recipient and the sender of a message.
    \item Fully qualified international phone numbers.
    \item A lone {\tt\#} can be used as recipient to send to anyone willing to listen
    \item A lone {\tt\#} can be used as sender to send anonymously.
  \end{items}

The recipient and sender parts of the packet contain information about who the message is intended for, as well as about who sends the message. Those are fully qualified phone numbers, including country prefixes, but no national escape. For example, if sending a message to someone in Britain, you would just use 44 as the country prefix.

To broadcast a message to anyone willing to listen, use a lone octothorpe as the recipient.

To send a message anonymously, one may use a lone octothorpe as the sender; however, this is considered extremely rude, and others will often refuse to listen to you if you do so.

Sequence Number

  \slide{Sequence Number}
  \begin{center}
  \begin{tikzpicture}
    \draw[fill=orange!33!white] (3.5, 0) -- (5, 0) -- (5, 1) -- (3.5, 1) -- (3.5, 0);
    \draw[black] (0, 0) -- (13, 0) -- (13, 1) -- (0, 1) -- (0, 0);
    \node at (1, 0.5) {\footnotesize Recipient};
    \draw[black] (2, 0) -- (2, 1);
    \node at (2.75, 0.5) {\footnotesize Sender};
    \draw[black] (3.5, 0) -- (3.5, 1);
    \node at (4.25, 0.5) {\footnotesize Seq\#};
    \draw[black] (5, 0) -- (5, 1);
    \node at (5.75, 0.5) {\footnotesize Type};
    \draw[black] (6.5, 0) -- (6.5, 1);
    \node at (7.25, 0.5) {\footnotesize Length};
    \draw[black] (8, 0) -- (8, 1);
    \node at (10.5, 0.5) {\footnotesize Data};
  \end{tikzpicture}
  \end{center}
  \begin{items}
    \item The number of messages previously sent by the sender to the recipient
    \item Encoded over 3 tones interpreted as a decimal number
    \item Allows for understanding even if packets arrive out of order.
  \end{items}

The sequence number is a sequence of 3 tones interpreted as a decimal number that represents the number of messages previously sent by the sender to the recipient, modulo 1000. This part makes it possible to keep track of the grammar even if packets end up arriving out of order.

Type

  \slide{Type}
  \begin{center}
  \begin{tikzpicture}
    \draw[fill=orange!33!white] (5, 0) -- (6.5, 0) -- (6.5, 1) -- (5, 1) -- (5, 0);
    \draw[black] (0, 0) -- (13, 0) -- (13, 1) -- (0, 1) -- (0, 0);
    \node at (1, 0.5) {\footnotesize Recipient};
    \draw[black] (2, 0) -- (2, 1);
    \node at (2.75, 0.5) {\footnotesize Sender};
    \draw[black] (3.5, 0) -- (3.5, 1);
    \node at (4.25, 0.5) {\footnotesize Seq\#};
    \draw[black] (5, 0) -- (5, 1);
    \node at (5.75, 0.5) {\footnotesize Type};
    \draw[black] (6.5, 0) -- (6.5, 1);
    \node at (7.25, 0.5) {\footnotesize Length};
    \draw[black] (8, 0) -- (8, 1);
    \node at (10.5, 0.5) {\footnotesize Data};
  \end{tikzpicture}
  \end{center}
  One tone indicating the type of the Data 
  \begin{description}\itemsep0pt\small
    \item[0] Semantic information
    \item[A] Variable
    \item[*] Grammatical information
    \item[\#] String literal
    \item[1] Continuation
  \end{description}

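The type field indicates what kind of data is in the body of the packet. It is represented by a single tone and can take one of 5 values: 0 for semantic information, A for a variable, * for grammatical information, # for a string literal, and 1 for a continuation.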

Length

  \slide{Length}
  \begin{center}
  \begin{tikzpicture}
    \draw[fill=orange!33!white] (8, 0) -- (6.5, 0) -- (6.5, 1) -- (8, 1) -- (8, 0);
    \draw[black] (0, 0) -- (13, 0) -- (13, 1) -- (0, 1) -- (0, 0);
    \node at (1, 0.5) {\footnotesize Recipient};
    \draw[black] (2, 0) -- (2, 1);
    \node at (2.75, 0.5) {\footnotesize Sender};
    \draw[black] (3.5, 0) -- (3.5, 1);
    \node at (4.25, 0.5) {\footnotesize Seq\#};
    \draw[black] (5, 0) -- (5, 1);
    \node at (5.75, 0.5) {\footnotesize Type};
    \draw[black] (6.5, 0) -- (6.5, 1);
    \node at (7.25, 0.5) {\footnotesize Length};
    \draw[black] (8, 0) -- (8, 1);
    \node at (10.5, 0.5) {\footnotesize Data};
  \end{tikzpicture}
  \end{center}
  \begin{items}
    \item Length of the Data field in tones
    \item 2 tones interpreted as a decimal number
    \item maximum length of 32.
  \end{items}

The last field of the header is the length of the data. It is expressed as a two-tone decimal number ranging between 1 and 32.
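Putting the header fields together, assembling a packet is a plain concatenation in the order shown on the structure diagram. The sketch below is only an illustration: `build_packet` and `MAX_DATA_TONES` are hypothetical names, and it only checks the upper bound on the length.

```python
MAX_DATA_TONES = 32  # maximum value of the Length field


def build_packet(recipient: str, sender: str, seq: int, ptype: str, data: str) -> str:
    """Concatenate the header fields and the data into one string of tones."""
    if ptype not in ("0", "A", "*", "#", "1"):
        raise ValueError(f"unknown packet type: {ptype!r}")
    if len(data) > MAX_DATA_TONES:
        raise ValueError("data does not fit in a single packet")
    seq_field = f"{seq % 1000:03d}"    # 3 decimal tones, modulo 1000
    length_field = f"{len(data):02d}"  # 2 decimal tones
    return recipient + sender + seq_field + ptype + length_field + data


# Packet 008 of the translation: +690 042 broadcasts a grammatical packet.
print(build_packet("#", "690042", 8, "*", "1"))  # #690042008*011
```

Data longer than 32 tones is rejected here rather than split, since splitting is the job of continuation packets.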

Data

Type: Grammatical

  \slide{Data — Grammatical}
  \begin{items}
  \item Encodes Grammatical and Syntactic information
  \item Has a fixed number of possible values
  \item Describes a tree-like syntax
  \end{items}

Grammatical

Grammatical Generic

  \slide{Data — Grammatical — Generic}
  Data payload: Empty
  \begin{items}
    \item represent
  \end{items}

Grammatical Speakers

Grammatical Clause

Grammatical Collections

Type: Variable

  \slide{Data — Variable}

Type: Semantic

  \slide{Data — Semantic}
  \begin{items}
    \item Encodes a concept.
    \item Uses the Universal Decimal Classification to represent information:
    \begin{items}
    \item numerical values are encoded by their corresponding tones
    \item periods (which are only present in UDC to help readability) are dropped
    \item colons are encoded as {\tt C*}
    \end{items}
  \end{items}

Semantic packets encode concepts for use in the language. They use a modified version of the Universal Decimal Classification, a system used by libraries around the world to give numbers to documents for sorting and indexing purposes (another system you might have seen used for this is the Dewey Decimal Classification). Numerical values are encoded by their corresponding DTMF tones, while the symbols are encoded in a way that functions over DTMF. The details are shown over the next couple of slides.
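The symbol rules listed on the semantic slides can be sketched as a small translation table. This is an illustration only: `encode_udc` is a hypothetical helper, and it does not handle variable references (names wrapped in `*`) or non-UDC notation.

```python
# Symbol encodings as listed on the slides; digits map to their own tone.
UDC_SYMBOLS = {
    ".": "",    # periods only help readability and are dropped
    ":": "C*",
    "(": "A*",
    ")": "A#",
    "[": "B*",
    "]": "B#",
    '"': "C#",
    "-": "D*",
    "=": "D#",
    "+": "#",
}


def encode_udc(notation: str) -> str:
    """Turn a UDC notation string into the tones of a semantic packet."""
    tones = []
    for ch in notation:
        if ch in "0123456789":
            tones.append(ch)
        elif ch in UDC_SYMBOLS:
            tones.append(UDC_SYMBOLS[ch])
        else:
            raise ValueError(f"no tone encoding for {ch!r}")
    return "".join(tones)


print(encode_udc("-05"))     # D*05, the data of packet 019 in the translation
print(encode_udc("94(44)"))  # 94A*44A#
```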

  \slide{Data — Semantic}
  \begin{items}
    \item Uses the Universal Decimal Classification to represent information:
    \begin{items}
    \item parentheses are encoded as {\tt A*} (opening) and {\tt A\#} (closing)
    \item square brackets are encoded as {\tt B*} (opening) and {\tt B\#} (closing)
    \item quotes are encoded as {\tt C\#}
    \item dashes are encoded as {\tt D*}
    \end{items}
  \end{items}

  \slide{Data — Semantic}
  \begin{items}
    \item Uses the Universal Decimal Classification to represent information:
    \begin{items}
    \item equals are encoded as {\tt D\#}
    \item pluses are encoded as {\tt \#}
    \item References to variables are done by including the name of the variable in between {\tt *}
    \item Non UDC notation is achieved by referencing a variable containing a string literal.
    \end{items}
  \end{items}

  \slide{Data — Semantic}
  \begin{items}
  \item No dictionary is directly provided by me.
  \item Abridged version of the UDC at {\tt https://ucdsummary.info}
  \item A more complete version can be obtained from the consortium, or be consulted at a library.
  \end{items}

I do not share a dictionary myself for two reasons: firstly, it’d be highly impractical due to the nature of the UDC, but secondly, and most importantly, the UDC Consortium can be a bit stingy with royalties. So if you want access to the dictionary (which is literally just the UDC spec), you can access an abridged summary online at the address on screen, or obtain a more complete version from the UDC Consortium (which may cost a non-negligible amount of money) or from a local library (probably significantly more affordable).

Type: String Literal

  \slide{Data — String Literal}
  Raw text data
  \begin{items}
  \item Encoded as hexadecimal where {\tt *} stands for 0xE and {\tt\#} stands for 0xF
  \item Follows the EBCDIC 803 codepage
  \item If the string contains characters outside of EBCDIC 803: decompose it, write every character as Unicode in the form U+xxxxxxxx, then express that with EBCDIC 803
  \end{items}

String literal packets contain raw text data. Such data is used for non-UDC notation in semantic packets and for the name part of a proper noun.

The encoding of text works as follows:

If the string can be represented losslessly in EBCDIC Codepage 803, it is encoded following said codepage, in hexadecimal where * stands for 0xE and # stands for 0xF. Otherwise, the string is expressed in its Unicode decomposed normalisation form, each codepoint is written as the string U+ followed by the 0-padded 8 hexadecimal digit representation of the codepoint, and the resulting text is encoded with EBCDIC 803 in the same way.
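Here is a rough sketch of that procedure. It makes one loud assumption: the `EBCDIC_803` table below is only partial (space, `+`, uppercase Latin letters and digits), so Hebrew text and punctuation would wrongly take the Unicode path here. The helper names are hypothetical.

```python
import unicodedata

# PARTIAL EBCDIC 803 table (assumption: only the entries this sketch needs):
# space, '+', uppercase Latin letters and digits.
EBCDIC_803 = {" ": 0x40, "+": 0x4E}
EBCDIC_803.update({c: 0xC1 + i for i, c in enumerate("ABCDEFGHI")})
EBCDIC_803.update({c: 0xD1 + i for i, c in enumerate("JKLMNOPQR")})
EBCDIC_803.update({c: 0xE2 + i for i, c in enumerate("STUVWXYZ")})
EBCDIC_803.update({c: 0xF0 + i for i, c in enumerate("0123456789")})

# Hexadecimal digits 0-9 and A-D are their own tone; E and F become * and #.
HEX_TO_TONE = str.maketrans("EF", "*#")


def to_tones(text: str) -> str:
    """Encode text covered by the table as two hexadecimal tones per byte."""
    hex_digits = "".join(f"{EBCDIC_803[c]:02X}" for c in text)
    return hex_digits.translate(HEX_TO_TONE)


def encode_string_literal(text: str) -> str:
    """Encode directly when possible, else go through the U+xxxxxxxx form."""
    if all(c in EBCDIC_803 for c in text):
        return to_tones(text)
    decomposed = unicodedata.normalize("NFD", text)
    expanded = "".join(f"U+{ord(c):08X}" for c in decomposed)
    return to_tones(expanded)


print(encode_string_literal("A"))  # C1
print(encode_string_literal("a"))  # *44*#0#0#0#0#0#0#6#1
```

A single lowercase letter costs 20 tones instead of 2, because the whole string switches to the expanded notation.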

  \slide{Data — String Literal — EBCDIC 803}
  {
    \setmainfont{Iosevka}
    \fontsize{8}{9}\selectfont
    \begin{center}
      \def\s#1{{\fontsize{6}{8}\selectfont\itshape\bfseries #1}}
      \begin{tabular}{c|cccc|cccc|cccc|cccc}
	& \bf x0 & \bf x1 & \bf x2 & \bf x3 & \bf x4 & \bf x5 & \bf x6 & \bf x7 & \bf x8 & \bf x9 & \bf xA & \bf xB & \bf xC & \bf xD & \bf xE & \bf xF \\\hline
	\bf 0x&\s{NUL}&\s{SOH}&\s{STX}&\s{ETX}&\s{ST}&\s{HT}&\s{SSA}&\s{DEL}&\s{SSA}&\s{RI}&\s{SS2}&\s{VT}&\s{FF}&\s{CR}&\s{SO}&\s{SI}\\
	\bf 1x&\s{DLE}&\s{DC1}&\s{DC2}&\s{DC3}&\s{OSC}&\s{NL}&\s{BS}&\s{ESA}&\s{CAN}&\s{EM}&\s{PU2}&\s{SS3}&\s{IFS}&\s{IGS}&\s{IRS}&\s{ITB}\\
	\bf 2x&\s{PAD}&\s{HOP}&\s{BPH}&\s{NBH}&\s{IND}&\s{LF}&\s{ETB}&\s{ESC}&\s{HTS}&\s{HTJ}&\s{VTS}&\s{PLD}&\s{UP}&\s{ENQ}&\s{ACK}&\s{BEL}\\
	\bf 3x&\s{DCS}&\s{BU1}&\s{SYN}&\s{STS}&\s{CCH}&\s{MW}&\s{SPA}&\s{EOT}&\s{SOS}&\s{SGCI}&\s{SCI}&\s{CSI}&\s{DC4}&\s{NAK}&\s{PM}&\s{SUB}\\\hline
	\bf 4x& \s{SP} &&&&&&&&&&\$&.&<&(&+&|\\
	\bf 5x&\h א&&&&&&&&&&!&\h לי֞&*&)&;&¬\\
	\bf 6x&-&&&&&&&&&&&,&\%&\_&>&?\\
	\bf 7x&&&&&&&&&&&:&\#&@&\textquotesingle&=&\textquotedbl\\\hline
	\bf 8x&&\h ב &\h ג &\h ד &\h ה &\h ו &\h ז &\h ח &\h ט &\h י &&&&&&\\
	\bf 9x&&\h ך &\h כ &\h ל &\h ם &\h מ &\h ן &\h נ &\h ס &\h ע &&&€&&\h ₪ &\\
	\bf Ax&&&\h ע &\h ף &\h פ &\h ץ &\h צ &\h ק &\h ר &\h ש &\h ת &&&&&\\
	\bf Bx&&&&&&&&&&&&&&&&\\\hline
	\bf Cx&&A&B&C&D&E&F&G&H&I&&&&&&\\
	\bf Dx&&J&K&L&M&N&O&P&Q&R&&\s{LRO}&\s{RLO}&\s{PDF}&&\\
	\bf Ex&&&S&T&U&V&W&X&Y&Z&&&&&&\\
	\bf Fx&0&1&2&3&4&5&6&7&8&9&&\s{LRF}&\s{RLF}&\s{LRM}&\s{RLM}&\s{APC}\\
      \end{tabular}
    \end{center}
  }
  % TODO ADD EBCDIC TABLE

The EBCDIC 803 code page is a codepage that was (and sadly still is) used by IBM mainframes in Israel. It supports the Hebrew writing system, uppercase Latin letters (but no lowercase and no diacritics), numbers, and a bunch of punctuation. In practice, that means that many strings, despite lacking any special characters, are encoded in expanded Unicode notation for the sole reason that they contain lowercase letters.

Type: Continuation

  \slide{Data — Continuation}
  \begin{items}
  \item Used when the data segment of a packet exceeds 32 tones
  \item Can chain an arbitrary number of those (until all data is expressed)
  \end{items}
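Splitting an oversized payload could look like the sketch below, where the first chunk keeps the original type and every following chunk is sent as a continuation (type 1). `split_payload` is a hypothetical helper; assigning sequence numbers to the resulting packets is left out.

```python
MAX_DATA_TONES = 32  # maximum length of the Data field


def split_payload(ptype: str, data: str) -> list[tuple[str, str]]:
    """Split data into (type, chunk) pairs of at most 32 tones each."""
    chunks = [
        data[i:i + MAX_DATA_TONES]
        for i in range(0, len(data), MAX_DATA_TONES)
    ] or [""]
    # The first chunk keeps its type, the rest are continuations.
    return [(ptype if i == 0 else "1", chunk) for i, chunk in enumerate(chunks)]


# A 70-tone string literal needs one literal packet and two continuations.
for packet_type, chunk in split_payload("#", "5" * 70):
    print(packet_type, len(chunk))
```

With a 180-tone payload this gives five packets of 32 tones and one of 20.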

The translation, packet by packet

  \section{Translation}
  \begin{quote}
  Hark! It was ruled by Agamashuya and His son Gu Sabah: Tian practices against the lesser side of the invisible origin of light, beset by cosmetic prohibitions of silence and restraint; for Ngu, a slave to creativity, shall make inspection and certification prior to confirmation of Najva Guns’ official status. Deny thine humanity: There are no politics in real life.
  \end{quote}
  \tslide{001}{*}{01}{B}{\sc new clause}
  \tslide{002}{*}{01}{B}{\sc new clause}
  \tslide{003}{*}{01}{B}{\sc new clause}
  \tslide{004}{*}{01}{B}{\sc new clause}
  \tslide{005}{*}{01}{B}{\sc new clause}
  \tslide{006}{*}{01}{B}{\sc new clause}
  \tslide{007}{*}{01}{B}{\sc new clause}
  \tslide{008}{*}{01}{1}{\sc assert}
  \tslide{009}{*}{02}{A1}{\sc 1sg}
  \tslide{010}{*}{01}{0}{\sc let}
  \tslide{011}{A}{01}{0}{\sc var(0)}
  \tslide{012}{*}{01}{D}{\sc ppn}
  \tslide{013}{\#}{32}{*44*\#0\#0\#0\#0\#0\#0\#4\#1*44*\#0\#0\#0\#0}{{\sc literal} “Agamasuya”}
  \tslide{014}{1}{32}{\#0\#0\#6\#7*44*\#0\#0\#0\#0\#0\#0\#6\#1*44*}{\sc continuation}
  \tslide{015}{1}{32}{\#0\#0\#0\#0\#0\#0\#6C4*44*\#0\#0\#0\#0\#0\#0}{\sc continuation}
  \tslide{016}{1}{32}{\#6\#1*44*\#0\#0\#0\#0\#0\#0\#7\#3*44*\#0\#0}{\sc continuation}
  \tslide{017}{1}{32}{\#0\#0\#0\#0\#6\#8*44*\#0\#0\#0\#0\#0\#0\#7\#5}{\sc continuation}
  \tslide{018}{1}{20}{*44*\#0\#0\#0\#0\#0\#0\#6\#1}{\sc continuation}
  \tslide{019}{0}{04}{D*05}{person}
  \tslide{020}{*}{01}{3}{\sc transitive clause}
  \tslide{021}{*}{01}{C}{\sc collection}
  \tslide{022}{A}{01}{0}{\sc var(0)}

Closing

  \end{document}
