riju/README.md

582 lines
23 KiB
Markdown

# Riju
Riju is a very fast online playground for every programming language.
In less than a second, you can start playing with a Python interpreter
or compiling INTERCAL code.
Check out the [live application](https://riju.codes/)!
**You should not write any sensitive code on Riju, as NO GUARANTEES
are made about the security or privacy of your data. (No warranty etc
etc.)**
This project is a work in progress, and I don't intend on thoroughly
documenting it until it has reached feature-completeness.
## Criteria for language inclusion
I aspire for Riju to support more languages than any reasonable person
could conceivably think is reasonable. That said, there are some
requirements:
* **Language must have a clear notion of execution.** This is because
a core part of Riju is the ability to execute code. Languages like
[YAML](https://yaml.org/), [SCSS](https://sass-lang.com/), and
Markdown are fine because they have a canonical transformation (into
[JSON](https://www.json.org/json-en.html),
[CSS](https://developer.mozilla.org/en-US/docs/Web/CSS), and
[HTML](https://developer.mozilla.org/en-US/docs/Web/HTML)
respectively) that can be performed on execution. However, languages
like JSON, CSS, and HTML are not acceptable, because there's nothing
reasonable to do when they are run.
* **Language must not require input or configuration.** This is
because, in order to avoid bloating the interface, Riju provides a
way to supply code but not any other data. Of course, it's possible
to supply input interactively, so reading stdin is allowed, but if a
language can only reasonably be programmed with additional input,
it's not a candidate for inclusion. Thus, many templating languages
are excluded, since they don't do anything unless you are
substituting a value. However, some languages such as
[Pug](https://pugjs.org/) are allowed, because they implement a
significant syntax transformation outside of template substitution.
Also, languages like [Sed](https://www.gnu.org/software/sed/) and
[Awk](https://www.gnu.org/software/gawk/) are allowed, because it's
straightforward to test code written in them even without a
pre-prepared input file.
* **Language must not require a graphical environment.** This is
because we use a pty to run code, and there is no X forwarding. As
such, we can't use languages like
[Scratch](https://scratch.mit.edu/),
[Alice](https://www.alice.org/), and
[Linotte](http://langagelinotte.free.fr/wordpress/).
* **Language must be available for free under a permissive license.**
This is because we must download and install all languages
noninteractively in the Docker image build, so anything that
requires license registration is unlikely to work (or be legal). We
can't use [Mathematica](https://www.wolfram.com/mathematica/) or
[MATLAB](https://www.mathworks.com/products/matlab.html), for
example, but we can use [Mathics](https://mathics.github.io/) and
[Octave](https://www.gnu.org/software/octave/), which provide
compatible open-source implementations of the underlying languages.
* **Language must be runnable under Docker on Linux.** This is because
that's the execution environment we have access to.
[AppleScript](https://en.wikipedia.org/wiki/AppleScript) is out
because it only runs on macOS, and [Docker](https://www.docker.com/)
is out because it can't be run inside Docker (without the
`--privileged` flag, which has unacceptable security drawbacks; see
[#29](https://github.com/raxod502/riju/issues/29)). Note, however,
that many Windows-based languages can be used successfully via
[Mono](https://www.mono-project.com/) or
[Wine](https://www.winehq.org/), such as
[Cmd](https://en.wikipedia.org/wiki/Cmd.exe),
[C#](https://en.wikipedia.org/wiki/C_Sharp_(programming_language)),
and [Visual Basic](https://en.wikipedia.org/wiki/Visual_Basic).
Here are some explicit *non-requirements*:
* *Language must be well-known.* Nope, I'll be happy to add your pet
project; after all, [Kalyn](https://github.com/raxod502/kalyn) and
[Ink](https://github.com/thesephist/ink) are already supported.
* *Language must be useful.* I would have no objection to adding
everything on the esolangs wiki, if there are interpreters/compilers
available.
* *Language must be easy to install and run.* Well, it would be nice,
but I've seen some s\*\*\* when adding languages to Riju so it will
take a lot to surprise me at this point.
If you'd like to request a new language, head to the [language support
meta-issue](https://github.com/raxod502/riju/issues/24) and add a
comment. Of course, if you actually want it to be added anytime soon,
you should submit a pull request :)
## Project setup
To run the webserver, all you need is Yarn and LLVM. Just run `yarn
install` as usual to install dependencies. For production, it's:
$ yarn backend |- or run all three with 'yarn build'
$ yarn frontend |
$ yarn system |
$ yarn server
For development with file watching and automatic server rebooting and
all that, it's:
$ yarn backend-dev |- or run all four with 'yarn dev'
$ yarn frontend-dev |
$ yarn system-dev |
$ yarn server-dev |
The webserver listens on `localhost:6119`. Now, although the server
itself will work, the only languages that will work are the ones that
happen to be installed on your machine. (I'm sure you can find a few
that are already.) Also, sandboxing using UNIX filesystem permissions
will be disabled, because that requires root privileges. If you want
to test with *all* the languages plus sandboxing (or you're working on
adding a new language), then you need to use Docker. Running the app
is exactly the same as before, you just have to jump into the
container first:
$ make docker
Note that building the image typically requires over an hour and 20 GB
of disk space, and it is only growing.
The above command generates the development image as a subroutine. You
can skip this and use the last tagged development image:
$ make docker-nobuild
Or you can explicitly build the image without running it:
$ make image-dev
The production image is based on the development one, with some
additional layers. You can build it as follows:
$ make image-prod
Lastly I should mention the tests. There are integration tests for
every language, and they can be run as follows:
$ [CONCURRENCY=2] [TIMEOUT_FACTOR=1] yarn test [<filter>...]
Filters can be for language (`python`, `java`) or test type (`hello`,
`lsp`). You can comma-delimit multiple filters to do a disjunction,
and space-delimit them to do a conjunction (`yarn test hello
python,java` for the `hello` tests for `python` and `java`).
The tests are run automatically when building the production image,
and fail the build if they fail.
See also [riju-cdn](https://github.com/raxod502/riju-cdn).
## Adding a language
The workflow for adding a language is more streamlined than you might
expect, given that building Riju's Docker image takes over an hour.
This is because there is no need to rebuild the image when a change is
made. Instead, you can manually apply the changes to a running
container in parallel with adding those changes to the Dockerfile
scripts.
### Install
The first step in adding a language is figuring out how to install it.
There are a number of considerations here:
* If it's available from Ubuntu, that's the best option.
* Language-specific package managers are a second-best choice.
* Downloading precompiled binaries is also not the worst. It's best if
upstream offers a .deb download, but manual installation is fine
too.
* Compiling from source is the worst option, but sometimes it's the
only way.
Typically, I `sudo su` and change directory to `/tmp` in order to test
out installation. Once I've identified a way to install such that the
software appears to function, I transcribe the commands from my shell
back into the relevant Dockerfile script.
#### Dockerfile scripts
These are as follows:
* `docker-install-phase0.bash`: perform initial upgrade of all Ubuntu
packages, unminimize system
* `docker-install-phase1.bash`: configure APT repositories and
additional architectures
* `docker-install-phase2.bash`: install tools that are used for Riju
itself (build and development tools)
* `docker-install-phase3a.bash`: install APT packages for languages
A-D
* `docker-install-phase3b.bash`: install APT packages for languages
E-L
* `docker-install-phase3c.bash`: install APT packages for languages
M-R
* `docker-install-phase3d.bash`: install APT packages for languages
S-Z
* `docker-install-phase4.bash`: install precompiled binaries and
tarballs
* `docker-install-phase5.bash`: set up language-specific package
managers and install packages from them
* `docker-install-phase6.bash`: install things from source
* `docker-install-phase7.bash`: set up project templates for languages
that require you start by running a "create new project" command,
and install custom wrapper scripts
* `docker-install-phase8.bash`: set up access control and do final
cleanup
#### Rolling-release policy
You'll notice in these scripts a distinct lack of any version numbers.
This is because Riju uses rolling-release for everything that can
conceivably be rolling-released (even things that look like they're
probably never *going* to get a new release, since the last one was in
2004).
For APT and language-specific packages, this is typically simple. A
small number of APT packages include a version number as part of their
name for some reason, and I work around this using various
`grep-aptavail` incantations at the top of the `phase-3[ad].bash`
scripts. I suggest checking those examples and referring to the
`grep-aptavail` man page to understand what is going on.
For binaries and tarballs in `phase4.bash`, a version number is
typically encoded in the download URL. For projects available via
GitHub Releases (preferred), there is a `latest_release` shell
function to fetch the latest tag. For things hosted elsewhere, I
resort to using `curl` and `grep` on the download homepage to identify
the latest version number or download URL. Crafting an appropriate
pipeline for these cases is as much an art as a science. We simply
hope that the relevant webpages will not have their layout changed too
frequently.
#### Conventions
* We do all work from `/tmp` and clean up our files when done. (The
current code doesn't always do a great job of this; see
[#27](https://github.com/raxod502/riju/issues/27).)
* When changing directory, we use `pushd` and `popd` in pairs.
* We prefer putting files where they're supposed to be in the first
place, rather than moving (or worse, copying) them. This can be
accomplished by means of `wget -O`, `unzip -d`, `tar -C
[--strip-components]`, and similar.
* We like to keep things as minimal as possible in terms of shell
scripting, but try to follow the standard installation procedure
where reasonable.
### Language configuration
After installing the language, you'll need to configure it. This is
done by adding an entry to
[`backend/src/langs.ts`](backend/src/langs.ts).
#### Required keys
Here is an example of a minimal language configuration, with only the
required keys:
```ts
befunge: {
name: "Befunge",
main: "main.be",
run: "befunge-repl main.be",
template: `64+"!dlrow ,olleH">:#,_@
`,
},
```
We have five things here:
* The language ID `befunge`, which appears in the URL
(<https://riju.codes/befunge>) and is used internally to track the
language. This can contain Unicode characters, but it must be safe
for URLs, so for example `+` is fine but `#` is not.
* The language display name `Befunge`, which is shown on the language
homepage and in some UI messages. The homepage is sorted by this
key.
* The name `main.be` of the file where a program is stored to be run.
When the user clicks the Run button, whatever is in the code editor
will be saved to this filename before the run command is executed.
We try to use a file extension that is actually appropriate for the
language, but if there's no standard one it's okay to pick something
that seems reasonable. The "Hello, world" resources on the [Riju
wiki](https://github.com/raxod502/riju/wiki) can be a helpful source
of plausible file extensions for obscure languages.
* The shell command `befunge-repl main.be` (executed in Bash) to run
the main file.
* The code `64+"!dlrow ,olleH">:#,_@` to prepopulate in the code
editor on page load. This should print "Hello, world!" exactly with
a trailing newline. Some languages are extremely difficult to write
in, so for those cases it's okay to copy a "Hello, world" program
from online even if it doesn't have the formatting quite right. The
`template` key should use a backtick-delimited string literal with a
trailing newline.
After you add these four keys, you should be able to test your
language at <http://localhost:6119/yourlanguage>. You shouldn't have
to manually restart anything if you're running `yarn dev` already.
#### Interactive languages
Some languages provide an interactive REPL facility. Such languages
should be configured to expose that functionality on Riju. That's done
by adding a `repl` key, like so:
```ts
python: {
name: "Python",
repl: "python3 -u",
main: "main.py",
run: "python3 -u -i main.py",
template: `print("Hello, world!")
`,
},
```
The `repl` key has another shell command to be executed with Bash,
which should launch a REPL independently of whatever may be in the
main file (in this case `main.py`). Also, for interactive languages,
the `run` command is different. It should not only run the code in
`main.py`, but then subsequently start the REPL. Many languages
provide a way to do this conveniently; for Python, the `-i` flag
forces the REPL to launch after `main.py` has finished executing.
One thing you will want to test is whether variables from your code
are accessible in the REPL. If it's possible to make this happen, do
it. Some languages require that you use a specific command-line
argument to get this behavior, while others may not support it at all.
For languages that do not have a convenient `-i` flag like Python, it
is okay to explicitly launch the REPL after running the code:
```ts
kitten: {
name: "Kitten",
repl: "kitten",
main: "main.ktn",
run: "kitten main.ktn; kitten",
template: `"Hello, world!" say
`,
},
```
##### Hacking startup files
Many languages *appear* to have no way to start a REPL with your
code's variables in scope, but nevertheless have a secret backdoor.
Check to see if the language supports a "startup file" or "profile
file" or "REPL configuration file" or "rc file" or anything like that.
If it does, we can often put the user's code in there and it will be
read in a special way when the REPL starts up. This is especially
useful for shells:
```ts
zsh: {
name: "Zsh",
repl: "SHELL=/usr/bin/zsh zsh",
main: ".zshrc",
createEmpty: ``,
run: `SHELL=/usr/bin/zsh zsh`,
template: `echo "Hello, world!"
`,
},
```
When this hack is used, the `repl` and `run` commands are often the
same, with the different behavior of running the user's code or not
being caused by whether or not the code has been put into the profile
file (in this case `.zshrc`).
Note the use of the `createEmpty` key here. To make LSP work properly
(more on that later), the main file (here `.zshrc`) is actually
created even before the user clicks the Run button. This causes
problems for languages that use a startup file hack, since when
executing the `repl` shell command, the "Hello, world" code from the
template will get executed, which is undesired. To work around this,
the `createEmpty` key allows you to provide a special value to write
into the main file (here `.zshrc`) before executing the `repl` shell
command. Providing the empty string ensures no user code gets executed
when we just want to launch a REPL.
Check out Beanshell, Elvish, Factor, GEL, Ksh, R, SageMath, Sh, Tcl,
Tcsh, and Zsh for examples of this hack.
#### Compiled languages
For languages that have a compilation step, it's nice to split out
that step into a separate shell command under the `compile` key, like
so:
```ts
c: {
name: "C",
main: "main.c",
compile: "clang -Wall -Wextra main.c -o main",
run: "./main",
template: `#include <stdio.h>
int main() {
printf("Hello, world!\\n");
return 0;
}
`,
},
```
It's not treated any differently by Riju at present than just cramming
both steps into the `run` key with a `&&`, but we could implement
optimizations later such as only re-running the compile step if the
code actually changed.
#### Integration tests
Riju has a suite of over 500 integration tests covering all supported
languages. This is to ensure that our aggressive rolling-release
policy does not lead to breakage.
My design philosophy of tests is that if they require any effort to
write, nobody is going to write them, least of all me. For this
reason, Riju uses a
[convention-over-configuration](https://en.wikipedia.org/wiki/Convention_over_configuration)
scheme to automatically synthesize the majority of the integration
tests without the need for any code to be written.
There are seven possible tests each language can have, and each
language has some subset of them:
* `run`: Verify that running the language with the default code causes
`Hello, world!` to be printed.
* `repl`: Verify that typing in an expression at the REPL (by default
`123 * 234`) causes a result to be output (by default `28782`).
* `runrepl`: Same as `repl`, but after the Run button is clicked. This
proves that a REPL is being started after the code finishes running.
* `scope`: Verify that a variable defined in the code is accessible in
the REPL after the code is run.
* `format`: Verify that a code formatter correctly reformats a piece
of sample code.
* `lsp`: Verify that an LSP server produces a particular completion.
* `ensure`: Run an arbitrary shell command and verify that it exits
successfully. This test is currently not used by any language.
The `run`, `repl`, and `runrepl` tests are configured automatically
with default values for every language. The other test types are not
configured automatically, because there is no way to pick a reasonable
default for the behavior they are testing. Use the following list to
identify which tests you should make sure are configured for your
language:
* `run`: all languages
* `repl`, `runrepl`: interactive languages (ones with a `repl` key)
* `scope`: interactive languages where you can access code variables
from the REPL
* `format`: languages with code formatters (see below)
* `lsp`: languages with LSP support (see below)
##### Test configuration
FIXME
## Debugging tools
Add `#debug` to the end of a Riju URL and reload the page to output
all messages in JSON format in the JavaScript console. You can copy
the LSP messages as JSON for direct use in the LSP REPL (see below).
To get a sandboxed shell session, the same as is used to run languages
on Riju, run:
$ yarn sandbox
To start up a JSON REPL for interacting with LSP servers, run:
$ yarn lsp-repl (LANGUAGE | CMD...)
## Self-hosting
Riju is hosted on [DigitalOcean](https://www.digitalocean.com/). Sign
up for an account and obtain a personal access token with read/write
access.
You will need some credentials. Start by selecting an admin password
to use for the DigitalOcean instance. Then generate two SSH key-pairs
(or you can use pre-existing ones). One is for the admin account on
DigitalOcean, while the other is to deploy from CI.
Install [Packer](https://www.packer.io/). Riju uses Packer to generate
DigitalOcean AMIs to ensure a consistent setup for the production
instance. Navigate to the `packer` subdirectory of this repository and
create a file `secrets.json`, changing the values as appropriate for
your setup:
```json
{
"digitalocean_api_token": "28114a9f0ed5637c576794138c71bf03d01946288a6922ea083f923ec883c431",
"admin_password": "R3iIhqs856N1sT5Mg6QFAsB5VPJrXS",
"admin_ssh_public_key_file": "/home/raxod502/.ssh/id_rsa.pub",
"deploy_ssh_public_key_file": "/home/raxod502/.ssh/id_rsa_riju_deploy.pub"
}
```
We'll start by setting up Riju without TLS. Run:
$ packer build -var-file secrets.json config.json
This will take about five minutes to generate a DigitalOcean AMI. Log
in to your DigitalOcean and launch an instance based on that AMI
(called an "Image" in the interface). The hosted version of Riju uses
the $10/month instance with 1 vCPU and 2GB memory / 50GB disk.
Root login is disabled on the AMI generated by Packer, but
DigitalOcean unfortunately doesn't give you any option to leave login
settings unchanged. I suggest setting the root password to a random
string. Make a note of the IP address of the droplet and SSH into it
under the admin user, using the key that you specified in
`secrets.json`. Now perform the following setup:
$ sudo passwd -l root
This completes the first DigitalOcean portion of deployment.
Now you'll need an account on [Docker Hub](https://hub.docker.com/),
which is where built images will be stored before they are pulled down
to DigitalOcean. Create a repository; the name will be
`your-docker-id/whatever-you-name-the-repo`. You'll need this below.
You're now ready to deploy. You can do this manually to begin with. In
the repository root on your local checkout of Riju, create a file
`.env`, changing the values as appropriate for your setup:
DOCKER_REPO=raxod502/riju
DOMAIN=riju.codes
DEPLOY_SSH_PRIVATE_KEY=/home/raxod502/.ssh/id_rsa_riju_deploy
Run:
$ docker login
$ make deploy
Riju should now be available online at your instance's public IP
address.
Next, let's configure TLS. You'll need to configure DNS for your
domain with a CNAME to point at your DigitalOcean instance. Once DNS
has propagated, SSH into your DigitalOcean instance and run:
$ sudo systemctl stop riju
$ sudo certbot certonly --standalone
$ sudo systemctl start riju
You'll also want to set up automatic renewal. This can be done by
installing the two Certbot hook scripts from Riju in the
`packer/resources` subdirectory. Here is one approach:
$ sudo wget https://github.com/raxod502/riju/raw/master/packer/resources/certbot-pre.bash
-O /etc/letsencrypt/renewal-hooks/pre/riju
$ sudo wget https://github.com/raxod502/riju/raw/master/packer/resources/certbot-post.bash
-O /etc/letsencrypt/renewal-hooks/post/riju
$ sudo chmod +x /etc/letsencrypt/renewal-hooks/pre/riju
$ sudo chmod +x /etc/letsencrypt/renewal-hooks/post/riju
At this point you should be able to visit Riju at your custom browser
with TLS enabled.
We can now set up CI. Sign up at [CircleCI](https://circleci.com/) and
enable automatic builds for your fork of Riju. You'll need to set the
following environment variables for the Riju project on CircleCI,
adjusting as appropriate for your own setup:
DOCKER_USERNAME=raxod502
DOCKER_PASSWORD=MIMvzS1bKPunDDSX4AJu
DOCKER_REPO=raxod502/riju
DOMAIN=riju.codes
DEPLOY_SSH_PRIVATE_KEY=b2Rs......lots more......SFAK
To obtain the base64-encoded deploy key, run:
$ cat ~/.ssh/id_rsa_riju_deploy | base64 | tr -d '\n'; echo
New pushes to master should trigger deploys, while pushes to other
branches should trigger just builds.