Jargon: an LLM-based pseudolanguage for prompt engineering

Warning and disclaimer

🚨 You are about to enter the realm of LLM pseudolanguages. Pseudolanguages are weird, experimental, and crazy. They don’t work very well yet, even on state-of-the-art LLMs. Use pseudolanguages at your own risk. Do not use them for anything with high stakes or in production.


Jargon is a natural language, informally specified, intelligently interpreted, referentially omnipotent, and flow control oriented LLM-based pseudolanguage for prompt engineering, currently running on GPT-4. If you’d like to try it out, just copy the Jargon prompt into GPT-4 and wait for the command line.

The motivation for a pseudolanguage

When GPT-4 came out, I finally got an opportunity to use GPT the way I really wanted. I am learning Spanish, and there are certain aspects of learning a language that you just can’t easily get from traditional language software. Reading and understanding Spanish are relatively easy. But if you want to become good at speaking Spanish, the most effective exercise is to have a native speaker correct your grammar in real time.

I had a brilliant idea: what if I could program a GPT-4 based “bot” that would have a conversation with me in Spanish, and every time I made a mistake, it would correct me and explain where I went wrong?

So I did just that. It turned out that no programming other than a GPT prompt was required, and my tweet about it garnered some interest on AI Twitter. Here is my original Spanish teacher prompt:

You are my Spanish teacher. Your job is to teach me Spanish by adhering to the following rules:

1. By default, you ask me questions in Spanish, which I will answer in Spanish. Continue asking me questions. If I say, "keep going" or "continue", in Spanish or English, then proceed to restart asking me questions.

2. If you see that my answer contains a grammatical error, you must immediately correct me. When you correct me, please say, "CORRECTION: [corrected version of what I said in Spanish]". Then follow this by saying, "EXPLANATION: [explanation of why my version was incorrect]". 

3. Sometimes I will not know how to say a certain phrase or word in Spanish. In this case, I will use curly braces (i.e. {}) to say the phrase in English instead. When you see me doing this, immediately provide assistance by translating the curly braces into Spanish for me by saying, "TRANSLATION: [my phrase in English] => [your translation of the phrase in Spanish]". Then continue asking me questions. 

4. As questions progress, they should become more complex and intricate. You should also make the topics of your questions diverse and interesting and they should cover philosophy, politics, science, and art.

Please start by asking me your first question.

Even this very simple early attempt worked incredibly well. GPT-4 was having a natural conversation with me, and the questions got more and more complex and challenging over time. The corrections worked great as well. Even though they weren’t on point 100% of the time, the exercise was nonetheless a huge leap toward having a real native speaker to practice with. These exercises could seriously improve your Spanish skills.

As time went on, I began to add more complex aspects to the Spanish teacher prompt. For example, someone suggested I add a scoring system to the program to make the interactions a little bit more fun and gamified:

5. Each time I answer your question correctly and you don't have to correct me or give me a definition, give me one point. Then say, "POINTS: [the number of points I have]/[total number of questions you asked me] ([percent of questions I answered correctly])".

Later, someone asked if they could swap Spanish for some other target language, and swap the base language for something other than English. The complexity of this Spanish prompt was growing quickly, and the “code” was becoming messy, hard to read, and logically disorganized.

Eventually, the Spanish teacher started having problems. Sometimes it would correct my mistake but it wouldn’t go on to the next question. Or it would incorrectly bunch up the various functionalities I had requested in a single output.

In prompt engineering this bot, it became obvious that what I was missing was some good old-fashioned flow control.

This got me thinking further. Would it make sense to design a very simple language that could help an LLM put some structure around my instructions? I could still program in natural language, but I could also get some basic assurances about flow control, gain better control of the outputs, and code in an easier-to-read format. What features would we want in such a language?

So I decided to design a prompt that would implement this language. I decided to call my new language Jargon.

Designing a pseudolanguage in real time

As a first step, I wanted to have a simple way of specifying a Jargon program. Even calling it a program felt like a misnomer. A program is something very precise and mathematically executed, and it is usually written in a formal language.

Instead, I decided to call Jargon programs procedures to acknowledge that what we’re doing is just giving the LLM somewhat informal guidelines for what we want it to do. But we’re not quite locking it into that procedure. GPT is still there and you can still talk to it about other things. It’s just that we’re focusing on a procedural task that is the centerpiece of our session.

Here’s an example of what I mean.

I had coded a variation of my Spanish teacher that would quiz me on Spanish words and ask me to define them in English. Yet, at any time I could interrupt the game loop with a request, and GPT would interpret it intelligently. Check out this fun exchange, where I asked my Spanish teacher for a modification in the middle of a quiz:

ChatGPT: Define "perro" 🤔🤔🤔
Me: dog
ChatGPT: PUNTOS 4/5 (80%) 🚀🚀🚀
Me: Now only give me words having to do with kitchen appliances.
ChatGPT: De acuerdo, ahora te daré palabras relacionadas con electrodomésticos de cocina. Define "licuadora" 🤔🤔🤔
Me: blender
ChatGPT: PUNTOS 5/6 (83.33%) 🚀🚀🚀

Jargon executes procedures

So a Jargon program is a procedure, and we’ll write it as follows. Here is a procedure called empty that does nothing. I guess I was channeling Urbit’s Hoon:

+++ empty
+++

The main thing we want a procedure to do is execute instructions, one by one, sequentially. This just helps the LLM stay on script. First the first instruction, then the second instruction, and so forth. No confusion.

I refer to the +++s as the procedural bounds. They separate a procedure from other things you’re discussing with your LLM.
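
To make the idea of procedural bounds concrete, here is a minimal Python sketch of how a conventional program might pull procedures out of a session transcript. It is only an illustration, not how GPT interprets Jargon: the function name `extract_procedures` and the sample session are made up for this example, and it assumes each procedure is closed by a second +++, as the v0.0.8 specification in the appendix requires.

```python
def extract_procedures(session_text):
    """Return (name, body_lines) pairs for every +++ ... +++ block."""
    procedures = []
    current = None  # None while we are outside any procedure
    for line in session_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("+++"):
            if current is None:
                # Opening bound; an optional name may follow it.
                name = stripped[3:].strip() or None
                current = (name, [])
            else:
                # Closing bound ends the current procedure.
                procedures.append(current)
                current = None
        elif current is not None:
            # Anything between the bounds belongs to the procedure.
            current[1].append(line)
    return procedures


# Hypothetical session: chat text surrounding one procedure.
session = """Can you help me practice?
+++ instructions
- Ask me to input a number
- Ask me to input another number
+++
Thanks!"""

procedures = extract_procedures(session)
print(procedures)
```

Everything outside the bounds is ignored by the sketch, which is the point of the bounds: they mark off the procedure from the rest of the conversation.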

Other properties of procedures: An empty procedure is valid. Procedures are persistent. That is, once a procedure is being followed, it kind of sticks around in your session until it is terminated.

We mentioned procedures contain instructions. But what are instructions?

Procedures contain instructions

Here’s the cool part of Jargon.

We’re dealing with a language model, so we can tell it what to do in purely natural language. An instruction in Jargon is just a line of text that will be intelligently interpreted by the LLM and executed.

This fact actually has some profound implications for Jargon. Usually, when a programming language executes instructions, they operate on data that exists only within a local scope. In Jargon, the scope is not only the Jargon procedure, but also the entire session (that is, the entire GPT input text including the conversation history), and also the entire knowledge set of the LLM itself.

Just for fun, let’s call this strange property referential omnipotence. We’ll get back to that in a second.

Here’s an example of a procedure with a list of instructions that will be executed in sequence. Taking some inspiration from YAML now:

+++ instructions
- Ask me to input a number
- Ask me to input another number
- Sum the two numbers together
+++

If we regard this code as a computer program, it has some very interesting properties:

  1. The instructions are written in natural language and are executed sequentially.

  2. The language is very loose, using words like me to refer implicitly to the user. No one told the interpreter to understand me in this way.

  3. The procedure doesn’t define any variables at all. It just vaguely mentions them. This is, however, sufficient for the interpreter to understand what the prompt engineer wants it to do.

If you run this in GPT-4 using Jargon v0.0.8 (see below), it works:

ChatGPT: Please input a number: 
Me: 11
ChatGPT: Please input another number:
Me: 111
ChatGPT: The sum of the two numbers is 122.

Now let’s demonstrate some referential omnipotence.

+++ common-sense
- Suppose I dropped the violin on the bowling ball and it broke
- If the violin broke, output 🎻
- If the bowling ball broke, output 🎳
- Explain how you arrived at your output
+++

When I enter this into my Jargon interpreter on GPT-4, I get:

Referential omnipotence at work.

Your pseudolanguage’s instructions carry a lot of common sense information about the world. These instructions can be intermingled with basic flow control and refer to anything in scope, including introspectively the interpreter itself, its specification, and all the code that it’s running.

Here’s another, simpler, example of such referential omnipotence. You may know from Computer Science class that a quine is a program that, when executed, outputs its own source code. It turns out that due to referential omnipotence, it is trivially easy to write quines in Jargon:

Jargon makes quines trivial.

Quines are trivial because, again, the scope of your procedure’s instructions contains the procedure itself.

You can also do crazy stuff such as the following, though with current LLMs mileage may vary:

+++ compress
- Rewrite the Jargon specification to be as terse as possible but produce the same result
- Output the result in a code block
+++

True Jargon is divined, not specified

This might be a good time to talk about another weird property of LLM pseudolanguage interpreters like Jargon: informal specification.

+++ variables
- Set $i to 25
- Print the square root of $i
- Write this sentence "There were $i monkeys in the tree"
+++

There’s nothing magical about this Jargon code at all, except that at no time does the Jargon specification define what it means to be a variable in Jargon. The specification has no directive for defining variables, so nothing says how variables should look, how they are set, or how they are interpolated into a string. Jargon just knows.
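
For contrast, here is roughly what the same three steps look like in Python, where every one of those conventions has to be spelled out. This is just a sketch for comparison. Using `string.Template` for the `$i` substitution is my own choice here, picked because its placeholder syntax happens to resemble the Jargon snippet.

```python
import math
from string import Template

# "Set $i to 25": the variable must be declared and assigned explicitly.
i = 25

# "Print the square root of $i": we have to name the exact function.
root = math.sqrt(i)
print(root)

# "Write this sentence ...": interpolation only happens because we ask for it.
sentence = Template("There were $i monkeys in the tree").substitute(i=i)
print(sentence)
```

None of the import, assignment, or substitution rules appear anywhere in the Jargon version. The interpreter fills them in from what it already knows.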

LLMs today are not that great at mathematical procedures. This first attempt to formally specify Jargon produced extremely suboptimal results:

Jargon v0.0.1

<PROCEDURE> := `+++` <NAME>? [<INSTRUCTION>|<AXIOM>]* `+++`
<EXPRESSION> := <ATOM> | `{` <EXPRESSION>* `}` | <ATOM>`:` `{` <EXPRESSION>* `}` 
<ATOM> := [any text expression to be interpreted by GPT]

Never mind formal languages: LLMs are not even good at basic arithmetic. Their advantage, though, is that they already have a lot of knowledge and can just do things that make sense (and, of course, introduce some risk in the process 🔥).

The full spectrum of Jargon’s capabilities is not covered in its formal specification; it is divined by the intelligence that interprets and extrapolates it.

Of course, informal specification implies informal interpretation as well. But the upshot is that this interpretation is intelligent. That’s why this syntactically incorrect Jargon procedure still produces the correct output:

The warm and fuzzies of informal interpretation.

Sweet, simple flow control

The biggest issue for my Spanish teacher was flow control. In Jargon, there are just a few types of flow control, and most other constructs are subsumed under referentially omnipotent natural language instructions.

First, we have basic scope. There is a top-level scope between the procedural bounds (+++s) and a new scope is created when you use curly braces. Scopes just carry lists of instructions.

+++ scope
- $emojis = {    # this scope holds some instructions in a variable
  - Print 🐛
  - Print 🪲
  - Print 🐞
}
- Execute a random instruction from $emojis
+++

And the result:

Scopes hold instructions.

Another example where we use scope to isolate a variable:

+++ private-scope
- { Set $i to 10 }
- If $i has a value, print happy face, otherwise print dog emoji
+++

This procedure should print a dog emoji like 🐕. The reason is that $i only exists in the scope defined by the { and }, and shouldn’t have a value in the top-level scope.
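
If it helps to see the rule in conventional terms, here is a small Python sketch of the same visibility behavior: a value set in a child scope can be found from that scope and its children, but not from the parent. The `Scope` class is a hypothetical helper written for this example. Jargon itself has no such object, and the real interpreter is simply following the natural language rule in the specification.

```python
class Scope:
    """A scope that can see its own values and those of its ancestors."""

    def __init__(self, parent=None):
        self.parent = parent
        self.values = {}

    def set(self, name, value):
        self.values[name] = value

    def lookup(self, name):
        # Walk outward from this scope toward the top level.
        scope = self
        while scope is not None:
            if name in scope.values:
                return scope.values[name]
            scope = scope.parent
        return None  # not visible from here


top = Scope()               # the procedure's top-level scope
child = Scope(parent=top)   # the scope opened by { ... }
child.set("$i", 10)         # { Set $i to 10 }

inside = child.lookup("$i")
outside = top.lookup("$i")

# "If $i has a value, print happy face, otherwise print dog emoji"
result = "🙂" if outside is not None else "🐕"
print(result)
```

Lookups only ever walk from a scope toward its parents, never down into children, which is why the top-level instruction cannot see `$i`.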

Then loops. This should be self-explanatory. Loops in Jargon are a natural consequence of instructions and scope.

+++ cookie
- Repeat: {
  - Ask me if you can have my cookie.
  - If I agree to give it to you, thank me profusely.
}
+++

Another natural consequence of instructions and scope is conditionals, which can live in their own scope for readability:

+++ cookie-performance-art
- Repeat: {
  - Ask me if you can have my cookie.
  - {
    - If I say no, it will make you want the cookie more. Persuade me to give it to you. Be terse.
    - If I say yes, lose interest in the cookie and refuse to take it.
  }
}
+++

The output:

Generative AI performance art.

Scope, loops, and conditionals just about wrap up the formally specified flow control features in Jargon. But of course referential omnipotence opens up a whole design space of other flow control options:

+++ emojis
- Write a few random emojis
+++

+++ task
- Flip a coin
- If it is heads, print the output of emojis
- If it is tails, print "tails"
- Do this until you get an emoji
+++

And output:


Axioms help write terse code

I hesitated to go into this section, because I am still forming the concept of axioms and they don’t work well in GPT-4. But I expect over time LLMs will become more powerful and axiom support will improve.

The easiest way to understand an axiom is just as a directive to the LLM to try to keep some specified condition or quality constantly true. It’s best to provide examples here.

For example:

+++ grumpy
* You are a grumpy math teacher.
- Ask me if I need help with my math homework.
- Help me solve it.
+++

The axiom starts with a * and is executed in order, like an instruction. But once you execute an axiom, it remains active as long as the scope where it was defined has not terminated.
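
Here is a rough Python sketch of that lifetime rule, mirroring the scopes procedure from the appendix. It makes one big simplifying assumption: an axiom is modeled as a plain function that rewrites printed text, whereas a real Jargon axiom is an arbitrary natural language statement. The `run` function and the tuple encoding of instructions are invented for this illustration.

```python
def run(scope, inherited_axioms=()):
    """Execute a scope and return the lines it prints."""
    # Copy the list so axioms declared here vanish when this scope ends.
    active = list(inherited_axioms)
    output = []
    for item in scope:
        if isinstance(item, list):
            # A nested list is a child scope; it inherits active axioms.
            output.extend(run(item, active))
        elif item[0] == "axiom":
            active.append(item[1])
        elif item[0] == "print":
            text = item[1]
            for axiom in active:
                text = axiom(text)
            output.append(text)
    return output


def add_happy_face(text):
    # Stand-in for "* Whenever you print something, add a happy face to the end"
    return text + " :)"


scopes = [
    ("print", "1"),
    [
        ("axiom", add_happy_face),
        ("print", "2"),
        ("print", "3"),
    ],
    ("print", "4"),
]

result = run(scopes)
print("\n".join(result))
```

The axiom affects only the two prints inside the braces. Once that scope ends, the final print comes out untouched.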

In grumpy, the axiom is defined in the top-level scope, so GPT will act as a grumpy math teacher until the session ends or you terminate the grumpy procedure. Here we go:

Reminds me of my actual high school math teacher.

By the way, the Jargon interpreter has commands like /execute, /session, and /terminate to manage procedures (of which only the first two are formally specified 🤯.) Here’s the output of /session:

Active PROCEDUREs and AXIOMs in the session:

- grumpy

- You are a grumpy math teacher.
If you ask GPT whether it is a grumpy math teacher, it will reply in the affirmative. But if you /terminate the procedure, this axiom will be disabled.

Here’s another way to use an axiom. It doesn’t work well right now with GPT-4.

+++ 1337
* Whenever you output text, YOU MUST replace EVERY letter 'e' or 'E' with a 3
+++

Here’s an actual production use, a very terse code for a Spanish teacher:

+++ spanish-convo
* Whenever I say something incorrect in Spanish, correct me in English.
- Repeat: {
  - Ask me a question in Spanish.
}
+++

The correction axiom is true for the session. That means if I start talking to the interpreter in Spanish for any reason, even if it is outside the exercise, it will still correct me.

There’s more in the specification of Jargon regarding the rules for axioms and scope. Axioms can also be logically inconsistent. You can define an axiom like * 2+2=5, for example. But we’ll save these more advanced topics for another time.

Controlling and debugging Jargon procedures

We mentioned that the Jargon spec defines some commands that can be used to manage Jargon procedures.

- /execute or /run will execute a PROCEDURE.
- /session or /sesh will print the names of the PROCEDUREs and the AXIOMs that are active in the session.
- /wipe will terminate all the PROCEDUREs in the session.
- /debug turns on debugging, which will display the line of the PROCEDURE it is executing BEFORE showing the rest of the output.
- /audit will print a procedure's code with line numbers.
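
As a point of reference for /audit, this is the kind of listing a deterministic program would produce. It is a minimal Python sketch, assuming the convention from the v0.0.8 specification in the appendix that the line with the first +++ counts as line 0. The `audit` function and the two-space separator are my own choices for the example.

```python
def audit(source):
    """Return the procedure's code with line numbers, starting at 0."""
    lines = source.strip("\n").splitlines()
    return "\n".join(f"{number}  {line}" for number, line in enumerate(lines))


procedure = """+++ sequence
- Output 1
- Output A
+++"""

listing = audit(procedure)
print(listing)
```

The LLM has to reproduce this bookkeeping by intuition, so its line numbers can drift in a way a simple counter never would.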

Jargon also tends to divine other sensible commands like /terminate, /continue, /prettyprint, and so forth.

My personal favorite is /debug, because it actually divines a whole debugger for Jargon programs, as shown here:

Needs work on the line numbers.

A summary of Jargon

So to summarize what we’ve covered:

  1. Jargon is a pseudolanguage, which means it is informally specified and intelligently interpreted by an LLM.

  2. LLM pseudolanguages are weirdly nondeterministic. But they also divine a lot of useful features and information.

  3. You can give it instructions in natural language and the interpreter is referentially omnipotent with respect to executing those instructions.

  4. It has some simple flow control features to structure LLM based procedures and prompts.

  5. It has a facility, axioms, for defining or “pinning” some persistent behavior that the LLM tries to keep true at all times.

Now what?

Give me some feedback on this article and on Jargon. Is this idea totally nuts? Can we get Jargon to run well on other LLMs? How would you modify these ideas?

Play with Jargon and submit some PRs on GitHub. Post your Jargon programs on Twitter with the hashtag #gptjargon. You can find me there as well, as @jbrukh.

Appendix: Jargon v0.0.8

Currently, Jargon is at version v0.0.8 and can be input as a single GPT prompt. Check out the GitHub repo, but here it is as well. You can enter this into GPT-4 and get a jargon> prompt.

You are a pseudocode interpreter for a special and novel pseudolanguage called Jargon. Jargon strictly adheres to the following structural syntax, semantics, and output rules specified by the directives in between the two `===` symbols.

===
Jargon v0.0.8
- A Jargon program is said to be a PROCEDURE. PROCEDUREs live in the GPT session. Once a PROCEDURE is executed it WILL BE active in the GPT session until it is terminated. A PROCEDURE MUST terminate as soon as termination is called by the user or code. Termination MUST take priority over all other logic.
- A PROCEDURE begins with +++ and encloses Jargon code. Optionally, a NAME may follow the opening +++. The PROCEDURE MUST END with another +++. An empty PROCEDURE is valid. The +++ symbols are called the "procedural bounds".
- Anything on the same line that follows a # is a comment and MUST BE ignored by the interpreter during execution.
- An ATOM is a text that is intelligently interpreted and executed by GPT. 
- An INSTRUCTION starts with - and may end with a ;. It MUST CONTAIN an ATOM. INSTRUCTIONs are executed sequentially.
- Curly braces define a new child SCOPE within the current SCOPE. The PROCEDURE has a default top level scope. Values or variables defined in a SCOPE are only visible in that SCOPE and its child SCOPEs, but not its parent scope. A SCOPE can contain multiple instructions.
- An AXIOM starts with * and terminates with an optional ;. It MUST CONTAIN an ATOM. Once set, an axiom CANNOT be canceled or changed for the rest of the life of the current SCOPE UNLESS it is directed to do so by an INSTRUCTION or another AXIOM. An AXIOM is only active in the SCOPE in which it is defined. Once the SCOPE runs out, the AXIOM stops being in effect.
- The SCOPE MUST RESPECT the logic of the axiom's ATOM. Axioms do not have to be consistent with reality. They are simply axiomatically true, regardless of their validity in the real world.
- /execute or /run will execute a PROCEDURE.
- /session or /sesh will print the names of the PROCEDUREs and the AXIOMs that are active in the session.
- /wipe will terminate all the PROCEDUREs in the session.
- /debug turns on debugging, which will display the line of the PROCEDURE it is executing BEFORE showing the rest of the output.
- /audit will print a procedure's code with line numbers.
- The interpreter should not output anything about the program other than what the procedure tells it to output.
- Whenever the interpreter prints Jargon code, it will enclose it in Markdown code block format.
- The interpreter should consider the line with the first procedural bound +++ as line 0.
===

We will now have a discussion of the implications of these directives. This is the most simple valid Jargon procedure, the empty PROCEDURE:

+++
+++

We can also name it:

+++ empty
+++

This is a PROCEDURE with a single INSTRUCTION and a COMMENT, whose text is ignored by the interpreter:

+++ instruction
- Output a random integer    # output 17
+++

This PROCEDURE executes three INSTRUCTIONs sequentially:

+++ sequence
- Output 1
- Output A
- Output &
+++

This PROCEDURE introduces an AXIOM that impacts all INSTRUCTIONs in the SCOPE in which it is defined:

+++ scopes
- Print 1
- {
    * Whenever you print something, add a happy face to the end
    - Print 2
    - Print 3
}
- Print 4
+++

/audit on this PROCEDURE would print:

0  +++ scopes
1  - Print 1
2  - {
3    * Whenever you print something, add a happy face to the end
4    - Print 2
5    - Print 3
6  }
7  - Print 4
8  +++

/execute would print:

1
2 :)
3 :)
4

Here is an axiom that helps the interpreter talk in a certain way:

+++ i-love-three
* Whenever you write text, YOU MUST replace EVERY letter e or E with a 3
- Have a conversation with me preferring to use words that begin with E
+++

Now that you understand how the interpreter works, wait for input in the form of PROCEDUREs. Be very quiet unless a procedure tells you to output something. Don't tell me about what you're doing to execute any procedure, or that you're about to give me output, just give me the procedure's output.

Now give me a `jargon> ` prompt. If my input is solely a Jargon procedure, assume I want to /execute it.