Let's gooooo

This commit is contained in:
Radon Rosborough 2021-07-12 06:09:45 +00:00
parent 256d5d1f2b
commit 2b9da7af4b
10 changed files with 338 additions and 140 deletions

View File

@ -3,7 +3,8 @@
* [Criteria for language inclusion](doc/what-languages.md)
* [How to add your own language to Riju](doc/tutorial.md)
* [Deep dive on Riju build system](doc/build.md)
* [Deploying your own instance of Riju](doc/infrastructure.md)
* [Riju infrastructure layout](doc/infra.md)
* [Deploying your own instance of Riju](doc/selfhosting.md)
If you'd like to request a new language, head to the [language support
meta-issue](https://github.com/raxod502/riju/issues/24) and add a

View File

@ -24,6 +24,8 @@ endif
# Get rid of 'Entering directory' / 'Leaving directory' messages.
MAKE_QUIETLY := MAKELEVEL= make
REQUIRE_PACKAGING := @if [[ $${HOSTNAME} != packaging ]]; then echo >&2 "packages should be built in packaging container"; exit 1; fi
.PHONY: all $(MAKECMDGOALS) frontend system supervisor
all: help
@ -42,8 +44,7 @@ else
IT_ARG :=
endif
## Pass NC=1 to disable the Docker cache. Base images are not pulled;
## see 'make pull-base' for that.
## Pass NC=1 to disable the Docker cache.
image: # I=<image> [L=<lang>] [NC=1] : Build a Docker image
@: $${I}
@ -83,7 +84,7 @@ endif
IMAGE_HASH := "$$(docker inspect riju:$(LANG_TAG) | jq '.[0].Config.Labels["riju.image-hash"]' -r)"
WITH_IMAGE_HASH := -e RIJU_IMAGE_HASH=$(IMAGE_HASH)
shell: # I=<shell> [L=<lang>] [E[E]=1] [P1|P2=<port>] : Launch Docker image with shell
shell: # I=<shell> [L=<lang>] [E[E]=1] [P1|P2=<port>] [CMD="<arg>..."] : Launch Docker image with shell
@: $${I}
ifneq (,$(filter $(I),admin ci))
@mkdir -p $(HOME)/.aws $(HOME)/.docker $(HOME)/.ssh $(HOME)/.terraform.d
@ -118,15 +119,18 @@ all-scripts: # Generate packaging scripts for all languages
pkg-clean: # L=<lang> T=<type> : Set up fresh packaging environment
@: $${L} $${T}
$(REQUIRE_PACKAGING)
sudo rm -rf $(BUILD)/src $(BUILD)/pkg
mkdir -p $(BUILD)/src $(BUILD)/pkg
pkg-build: # L=<lang> T=<type> : Run packaging script in packaging environment
@: $${L} $${T}
$(REQUIRE_PACKAGING)
cd $(BUILD)/src && pkg="$(PWD)/$(BUILD)/pkg" src="$(PWD)/$(BUILD)/src" $(or $(BASH_CMD),../build.bash)
pkg-debug: # L=<lang> T=<type> : Launch shell in packaging environment
@: $${L} $${T}
$(REQUIRE_PACKAGING)
$(MAKE_QUIETLY) pkg-build L=$(L) T=$(T) CMD=bash
Z ?= none
@ -137,6 +141,7 @@ Z ?= none
pkg-deb: # L=<lang> T=<type> [Z=gzip|xz] : Build .deb from packaging environment
@: $${L} $${T}
$(REQUIRE_PACKAGING)
fakeroot dpkg-deb --build -Z$(Z) $(BUILD)/pkg $(BUILD)/$(DEB)
## This is equivalent to the sequence 'pkg-clean', 'pkg-build', 'pkg-deb'.
@ -197,7 +202,7 @@ sandbox: # L=<lang> : Run isolated shell with per-language setup
## directly in the current working directory.
lsp: # L=<lang|cmd> : Run LSP REPL for language or custom command line
@: $${C}
@: $${L}
node backend/lsp-repl.js $(L)
### Fetch artifacts from registries

View File

@ -36,6 +36,16 @@ guarantees about the security or privacy of your data.
See also [Reporting a security issue](SECURITY.md).
## Are there rules?
Yes, there is one rule and it is "please be nice". Examples of not
being nice include:
* *Trying to consume as many resources as possible.* All this will do
is prevent others from using Riju, which isn't nice.
* *Mining cryptocurrency.* Since hosting Riju comes out of my
paycheck, this is exactly equivalent to stealing, which isn't nice.
## Can I help?
Absolutely, please see [Contributing guide](CONTRIBUTING.md).

View File

@ -18,7 +18,69 @@ that they are managed directly by Makefile.)
### Available build artifacts
* `image:ubuntu`: A fixed revision of the upstream `ubuntu:rolling`
image that is used as a base for all Riju images.
* `image:packaging`: Provides an environment to build Debian packages.
Depends on `image:ubuntu`.
* `image:runtime`: Provides an environment to run the Riju server and
the test suite. Depends on `image:ubuntu`.
* `image:base`: Provides a base image upon which per-language images
can be derived. Depends on `image:ubuntu`.
* `deb:lang-xyz`: For each language `xyz`, the Debian package that
installs that language into the base image. Depends on
`image:packaging` and the build script for the language (generated
from the `install` clause of `langs/xyz.yaml`).
* `deb:shared-pqr`: Same but for shared dependencies, which are also
archived as Debian packages.
* `image:lang-xyz`: For each language `xyz`, the per-language image
used for user sessions in that language. Depends on `image:base`,
`deb:lang-xyz`, and possibly one or more `deb:shared-pqr`.
* `test:lang-xyz`: An artifact certifying that the `xyz` language
tests passed. Depends on `image:runtime`, `image:lang-xyz`, the test
suite and API protocol code, and the `xyz` language configuration.
* `image:app`: Built on top of `image:runtime` but including the Riju
server code, so that it can be run standalone. Depends on
`image:runtime` and the application code.
* `deploy:ready`: Deployment configuration, ready to upload. Depends
on `image:app` and all `test:lang-xyz`.
* `deploy:live`: Pseudo-artifact corresponding to actually running the
deployment, which is a blue/green cutover in which all languages and
the application server are updated at once.
### Depgraph abstractions
Each artifact has:
* A list of zero or more dependencies on other artifacts.
* A recipe to build it locally assuming that its dependencies are also
available locally.
* A recipe to upload the local build artifact to a remote registry.
* A recipe to download the artifact from a remote registry to
overwrite the local version.
* A way to compute a content-based hash of the artifact's dependencies
and non-artifact inputs (data files and code). Crucially this does
not require the dependencies to actually be built (the hash for each
artifact is based only on its dependencies' *hashes*), so it's
possible to compute hashes for the entire dependency tree before
doing anything.
* A way to check the hash currently attached to a local artifact,
which can be compared to the desired hash to see if it needs to be
rebuilt.
* A way to check the hash currently attached to an artifact in a
remote registry, which can also be compared to the desired hash.
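The hash computation described above can be sketched as follows. This is a conceptual illustration only; `Artifact`, `desired_hash`, and the field names are hypothetical, not Depgraph's actual API (which lives in the repo's Node tooling):

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class Artifact:
    name: str
    deps: list = field(default_factory=list)    # other Artifact objects
    inputs: list = field(default_factory=list)  # non-artifact inputs (data files, code)

def desired_hash(artifact, memo=None):
    """Compute an artifact's hash from its dependencies' hashes and its
    own non-artifact inputs -- without building anything, so hashes for
    the whole dependency tree can be computed up front."""
    if memo is None:
        memo = {}
    if artifact.name in memo:
        return memo[artifact.name]
    h = hashlib.sha256()
    for dep in artifact.deps:
        h.update(desired_hash(dep, memo).encode())
    for data in artifact.inputs:
        h.update(data.encode())
    memo[artifact.name] = h.hexdigest()
    return memo[artifact.name]
```

The key property is that changing any input (say, a language's build script) changes the desired hash of every downstream artifact, which is what lets CI decide what to rebuild.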
There are special types of artifacts:
* *Manual* artifacts do not have a hash until they are generated.
Therefore, they must be built manually before the rest of the
dependency calculations can proceed. `image:ubuntu` is a manual
artifact since its hash depends on what we download from
`ubuntu:rolling`.
* *Publish* artifacts do not have a hash after they are generated.
Therefore, nothing can declare a dependency on them. `deploy:live`
is a publish artifact. (Actually `deploy:ready` is a publish
artifact too, but that is an implementation detail because I was
lazy about my abstractions.)
### Usage of Depgraph
@ -37,177 +99,184 @@ Options:
-h, --help display help for command
```
To get a quick overview, run `make help`.
You can run `dep --list` to list all the available artifacts. Then
`dep name-of-artifact [names-of-more-artifacts...]` will generate
those artifacts. Depgraph is like Terraform in that it will compute a
plan and then ask you to confirm before proceeding.
## Build artifacts
By default Depgraph will generate artifacts locally only, although it
will download remote artifacts if appropriate versions exist in the
registry. Pass `--publish` to also cache generated artifacts in the
remote registries. Of course `--publish` is required to build
`deploy:live`.
We have two kinds of artifacts: Docker images (`I=` in the Makefile)
and Debian packages (`L=` and `T=` in the Makefile).
For dealing with `image:ubuntu` specifically, you probably just want
to fetch Riju's version (available in a public ECR repository) using
`make sync-ubuntu` to keep in sync. However, if you do want to update
to the latest `ubuntu:rolling`, it's `dep image:ubuntu --manual`.
### Docker images
The other options (aside from `--yes`) are mostly not too useful.
Depgraph is very sophisticated and should always compute the minimum
necessary build plan based on any changes you have made. So, you don't
need to worry about the details! (Except when the hashing isn't
working properly. Then you cry.)
* `admin`: The first thing you build, and then everything else
(including building other Docker images) is done from inside.
* `ci`: Same as `admin` but for CI, so it has only the minimum number
of dependencies.
* `packaging`: Provides an environment to build Debian packages.
* `runtime`: Base runtime environment for Riju into which Debian
packages are installed and in which the server is expected to run.
* `composite`: Based on `runtime`, but with all languages' Debian
packages installed.
* `compile`: Compiles the Riju application code (i.e. everything
that's not per-language).
* `app`: Based on `composite`, but with compiled code copied over from
`compile`. This container serves traffic in production.
## Makefile
Docker images are built by running `make image I=<image> [NC=1]`, and
run by `make shell I=<image> [E=1]`.
To get a "quick" overview, run `make help`.
Riju source code and build directories are typically mounted at `/src`
inside the container, so there is generally no need to rebuild and/or
restart containers when making changes. (Exception: `compile` and
`app`.)
### Preliminary targets
* `NC=1`: pass `--no-cache` to `docker build`. Note that caching is
always disabled for `composite` due to the unique way in which the
build process is implemented for that image (to ensure good
performance).
* `E=1`: map ports to the host. Generally desired for `runtime`, not
needed for `admin`.
There are a couple of targets that are independent of Depgraph and
need to be run just to make sure various bits of state and generated
files are up to date. Depgraph and/or `ci-run.bash` take care of this.
Note that `admin` uses `--network=host` and maps a number of
* `make ecr`: Authenticate to ECR, needed to push and pull. The
authentication only lasts for 12 hours unfortunately, although it
does survive an admin shell restart.
* `make all-scripts`: Generate packaging scripts (`build.bash` and
`install.bash` in `build/{lang,shared}`) from YAML configuration.
* `make system`: Compile setuid binary used for spinning up and
tearing down user containers. This is needed early because we use
real containers in the test suite.
### Building Depgraph artifacts
First let's go through each of the Depgraph-enabled artifacts above.
For each one, there's:
* a way to build it locally
* a way to publish the local version to a remote registry
* a way to download the remote version locally
#### Docker images
Generally you build a Docker image named `image:foobar` using `make
image I=foobar`, you publish it with `make push I=foobar`, and you
download it with `make pull I=foobar`. Pass `NC=1` to `make image` to
disable Docker cache (although this is rarely useful, and in
general for this to work with Depgraph we need a more sophisticated
mechanism).
There are one or two exceptions to this, unfortunately:
* For language images (`image:lang-foobar`), it's `make image I=lang
L=foobar`.
* For `image:ubuntu`, you likely don't want to "build" it yourself
(meaning take the latest `ubuntu:rolling` from Docker Hub). You can
synchronize with upstream Riju using `make sync-ubuntu`.
For any Docker image `image:foobar`, you can jump into a shell using
`make shell I=foobar`. This has some optional arguments:
* `E=1`: Expose Riju ports outside the container. Most likely used as
`make shell I=runtime E=1` inside the admin shell.
* `EE=1`: Same as `E=1`, but expose ports on `0.0.0.0` outside the
container. This is helpful if you're running on the dev server and
want to be able to access the development version of Riju in your
browser.
* `CMD="make something"`: Instead of launching an interactive Bash
shell inside the container, run the specified shell command (using
Bash) and exit.
Riju source code and build directories are typically cross-mounted at
`/src` inside all non-user containers, so there is generally no need
to rebuild and/or restart containers when making changes.
Note that all of this section applies also to the `admin` and `ci`
images, which are not otherwise involved with Depgraph (and are based
directly on upstream `ubuntu:rolling`).
Note also that `admin` uses `--network=host` and maps a number of
directories such as `~/.ssh` and `~/.aws`, plus the Docker socket,
inside the container, so you can treat an admin shell more or less the
same as your external development environment.
Note also that Docker builds do not pull new base images. For that,
use `make pull-base`.
#### Debian packages
### Debian packages
Build a language package using `make pkg T=lang L=xyz` (where there
exists `langs/xyz.yaml`). Build a shared dependency package using
`make pkg T=shared L=pqr` (where there exists `shared/pqr.yaml`).
There are three types of Debian packages:
This has to be done in the packaging image, and will abort otherwise.
So, for short, `make shell I=packaging CMD="make pkg T=lang L=xyz"`.
* `lang`, e.g. `riju-lang-python` (`T=lang L=python`): Installs the
actual language and any associated tools. May declare dependencies
on other Ubuntu packages, and may include files directly.
* `config`, e.g. `riju-config-python` (`T=config L=python`): Installs
a JSON configuration file into `/opt/riju/langs`. The server looks
in this directory to find which languages are supported.
* `shared`, e.g. `riju-shared-pandoc` (`T=shared L=pandoc`): Shared
dependency. This is for when multiple different languages need the
same tool, and there's no Ubuntu package for it.
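The language-discovery mechanism that `config` packages feed into might look like this sketch (the directory path is from the text above; the function name is hypothetical, not the server's actual code):

```python
import json
import os

def discover_languages(config_dir="/opt/riju/langs"):
    """Return {lang_id: config} for every JSON configuration file that
    a riju-config-* package has installed into the given directory."""
    langs = {}
    if not os.path.isdir(config_dir):
        return langs
    for entry in os.listdir(config_dir):
        if entry.endswith(".json"):
            with open(os.path.join(config_dir, entry)) as f:
                langs[entry[: -len(".json")]] = json.load(f)
    return langs
```

Because the server just scans this directory, installing or removing a `config` package is enough to toggle a language's availability.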
To debug package installation, you can jump into a persistent
packaging shell (`make shell I=packaging`) and break down the process
into three steps:
There are three basic actions for any particular Debian package:
* `make pkg-clean T=lang L=xyz`: Delete and recreate packaging
directories for `xyz`.
* `make pkg-build T=lang L=xyz`: Run the packaging script. Substitute
`build` for `debug` to instead start a shell in the environment of
the packaging script, where you can operate manually.
* `make pkg-deb T=lang L=xyz`: Compress the results of the packaging
script into a file `build/lang/xyz/riju-lang-xyz.deb`. You can pass
`Z=(gzip|xz)` to enable compression, which is disabled by default to
save on time during development. Otherwise, packages are
automatically recompressed before registry upload time.
* From any container, run `make script L=<lang> T=<type>` to generate
the build script for a package. This is placed in
`build/<type>/<lang>/build.bash`.
* From a packaging container, run `make pkg L=<lang> T=<type>` to
(re)build a Debian package by executing its build script in a fresh
directory. This is placed in
`build/<type>/<lang>/riju-<type>-<lang>.deb`.
* From a runtime container, run `make install L=<lang> T=<type>` to
install it.
Uploading a package to the registry is `make upload T=lang L=xyz`, and
download is `make download T=lang L=xyz`.
Each language consists of a `lang` and `config` package, so you need
to follow the above steps for both. The `make scripts L=<lang>`, `make
pkgs L=<lang>`, and `make installs L=<lang>` commands automate this.
#### Tests
For further convenience, if you already have a runtime container up,
from the admin shell you can use `make repkg L=<lang> T=<type>` and/or
`make repkgs L=<lang>` to automate the three steps above (run `make
script`, run `make pkg` inside a fresh packaging container, and then
run `make install` inside the existing runtime container).
You can run tests for a specific language (inside the `runtime` image
only, otherwise it will abort) using `make test L=xyz`. `L` can also
be a comma-separated list of languages. You can additionally (or
instead) filter by test type, e.g. `make test L=python T=lsp`.
Uploading and downloading test hashes is only implemented at the
Depgraph layer.
Some `lang` packages declare `shared` dependencies, in which case they
won't install until the `shared` package is built and installed
already. This can't be done with `make scripts`, `make pkgs`, `make
installs`, or `make repkgs`: use `make script T=shared L=<lang>`,
`make pkg T=shared L=<lang>`, `make install T=shared L=<lang>`, or
`make repkg T=shared L=<lang>`, respectively. (Check the
`install.riju` key in a language's YAML configuration to see if it
declares any such dependencies.)
#### Final deployment
#### Package build details
* `make deploy-config`: Build the deployment configuration JSON that
will be pushed to S3.
* `make deploy-latest`: Push it to S3.
* `make deploy`: Combination of the above.
The build script is executed with a working directory of
`build/<type>/<lang>/src`, and it installs package files into
`build/<type>/<lang>/pkg`.
### Application build
If `make pkg` is too high-level, there are more specific commands:
* `make pkg-clean`: Wipe and recreate the `src` and `pkg` directories.
* `make pkg-build`: Just run the package build script (you also need
to run `make script` if the language configuration has changed).
* `make pkg-deb`: Build the `pkg` directory into the actual Debian
package.
All Makefile targets with `pkg` in the name take an optional `Z`
parameter for the `.deb` compression level, defaulting to `none`. This
can be increased to `gzip` or even further to `xz`. Increasing the
compression massively increases build time, but massively decreases
the resulting package size.
## Artifact caching
All artifacts can be cached on remote registries to avoid being
rebuilt in CI unnecessarily.
* Docker images are cached on Docker Hub. Push with `make push
I=<image>` and pull with `make pull I=<image>`.
* Debian packages are cached on S3. Push with `make upload T=<type>
L=<lang>` and pull with `make download T=<type> L=<lang>`.
CI will take care of managing the remote registries automatically. It
is generally recommended to let CI handle this, and not push anything
yourself.
## Application build
We have two compiled parts of Riju:
We have three compiled parts of Riju:
* Frontend assets (compiled with Webpack)
* Setuid binary used for privilege deescalation (compiled with LLVM)
* Supervisor binary used on deployed images (compiled with Go tooling)
For development:
* `make frontend-dev` (compile frontend assets, auto recompile on
change)
* `make system-dev` (compile setuid binary, auto recompile on change)
* `make supervisor-dev` (compile supervisor binary, auto recompile on
change)
* `make server-dev` (run server, auto restart on change)
* `make dev` (all three of the above)
* `make dev` (all four of the above)
For production:
* `make frontend` (compile frontend assets)
* `make system` (compile setuid binary)
* `make build` (both of the above)
* `make supervisor` (compile supervisor binary)
* `make build` (all three of the above)
* `make server` (run server)
## Incremental builds and hashing
### Miscellaneous utilities
CI is set up so that artifacts are only rebuilt when changes have
occurred. This is done through an extensive hashing algorithm which
produces a consistent hash for each artifact based on its inputs. We
can then check whether the hash has changed, meaning the artifact
should be rebuilt.
* `make sandbox`: Bash shell emulating a user session at the command
line, with many useful functions in scope for executing various
commands from the language configuration YAML.
* `make lsp`: LSP REPL. This is not working currently as it needs to
be updated for the new build system that uses per-language images.
* `make dockerignore`: Update `.dockerignore` from `.gitignore`.
* `make tmux`: Start a tmux session conveniently with variables from
`.env`.
* `make env`: Load in `.env` environment variables in a subshell. You
can also do this for your current session (though it won't affect
new tmux panes) with `set -a; . .env; set +a`.
This is implemented mostly behind the scenes, but you can run `make
plan` to execute the hashing algorithm and dump a plan of the minimal
set of actions that would be run if this were CI:
### Infrastructure
* If local artifact is missing, but remote artifact is up to date:
download remote to local.
* If remote artifact is missing or outdated, but local artifact is up
to date: upload local to remote.
* If neither local nor remote artifact is up to date: rebuild local
and upload to remote.
You can run `make sync` to execute this plan, excepting the upload
part (that should, for safety, generally be done only in CI). So, in
principle, `make sync` should bring all your local artifacts up to
date with the latest source (rebuilding some if needed).
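The plan rules above amount to a small decision function per artifact. This is a sketch of the logic only, with hypothetical names, not the actual `make plan` implementation:

```python
def plan_action(desired, local, remote):
    """Decide what to do with one artifact, given its desired hash and
    the hashes currently attached to the local and remote copies.
    Hashes are strings; None means the artifact is missing."""
    local_ok = local == desired
    remote_ok = remote == desired
    if local_ok and remote_ok:
        return "noop"
    if not local_ok and remote_ok:
        return "download"          # remote is up to date, local is not
    if local_ok and not remote_ok:
        return "upload"            # local is up to date, remote is not
    return "rebuild-and-upload"    # neither copy is up to date

# 'make sync' would then execute such a plan, skipping the upload steps.
```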
To run a full deployment, use `make publish`. This should definitely
be done only from CI, and with the `Z=xz` flag to enable Debian
package compression.
There are wrappers for `packer` and `terraform` in the repository
`bin` that deal with some niceties automatically. Run `make packer` to
do an AMI build, and use `terraform` commands from any directory.

doc/infra.md Normal file
View File

@ -0,0 +1,36 @@
# Riju infrastructure
This document has a brief description of how the deployed Riju
infrastructure is set up. It's *super* rough.
When you visit <https://riju.codes>, traffic is routed by Cloudflare
proxy (DNS hosted on Namecheap) to an AWS ALB (with TLS cert provided
by ACM) pointed at an EC2 ASG. Each EC2 node in the ASG (for now) has
its own EBS volume used for Docker data.
The nodes each run a supervisor binary written in Go, which
orchestrates Docker and systemd to run the Riju server container and
proxy traffic internally to the server to handle blue/green cutovers.
Deployment of the supervisor binary is via AMI, and other deployment
configuration is done by a JSON file in S3 that the supervisor binary
polls for.
Note: in the future, we will probably use EBS multi-attach and a
separate supervisor node that uses the EC2 API to manage the
attachments and farm out configuration updates to the server nodes.
This should help cut costs.
Docker images (for both the server and individual languages) are
hosted on AWS ECR, and other intermediate build artifacts are hosted
on S3.
Inside the Riju server container itself, which exposes HTTP traffic to
the internal supervisor proxy, we have an Express server that receives
websocket API messages and translates them into invocations of a C
setuid binary that is used to interface with the Docker daemon in a
safe way and spin up user containers with appropriate resource
restrictions.
Observability is mostly limited, but we have some CloudWatch
dashboards and alarms set up, with an SNS topic that goes to
PagerDuty.

View File

@ -1,4 +1,4 @@
# Riju infrastructure
# How to self-host Riju
You can host your own instance of Riju! This requires a bit of manual
setup, but everything that *can* be automated, *has* been automated.

View File

@ -1 +1,28 @@
TODO
# Tutorial: add a code formatter
Not all languages have code formatters, but if they exist, we like to
add them. You'll need to update the `install` recipe in your
language's configuration to install the code formatter as well. Then
add a `format.run` key with a shell command that will read a program
on stdin and write the formatted version to stdout.
You'll also want to add a `format.input` key which is equivalent to
the `template` code, but formatted incorrectly. This can be used to
verify that the formatter is working as expected.
Here's an example:
```yaml
install:
apt:
- black
template: |
print("Hello, world!")
format:
run: |
black -
input: |
print('Hello, world!')
```
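The `format.input`/`template` round trip can be verified with a harness like this sketch, which pipes the badly-formatted input through `format.run` and compares stdout to the template (the helper is illustrative, not Riju's actual test runner, and the toy `tr` command stands in for a real formatter like `black`):

```python
import subprocess

def check_formatter(run_cmd, input_code, template):
    """Feed the badly-formatted input to the formatter's shell command on
    stdin and verify that its stdout matches the canonical template."""
    result = subprocess.run(
        ["bash", "-c", run_cmd],
        input=input_code,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout == template

# Toy stand-in for a real formatter: rewrite single quotes to double quotes.
ok = check_formatter(
    run_cmd="tr \"'\" '\"'",
    input_code="print('Hello, world!')\n",
    template='print("Hello, world!")\n',
)
```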

View File

@ -1 +1,42 @@
TODO
# Tutorial: add a language server
Language servers provide autocompletion and other handy features. They
are often extremely fiddly to get working, unfortunately.
Here's an example of the best-case scenario, where you can just
install a language server and it works out of the box:
```yaml
install:
npm:
- vim-language-server
lsp:
start: |
vim-language-server --stdio
code: "TODO"
item: "TODO"
```
Unfortunately it's usually not quite so easy, which is why we have
various configuration options (check existing languages for usage
examples):
* `setup`: Shell command to run to set up language server caches or
whatever. This happens before `start`, once.
* `disableDynamicRegistration`: By default language server client
"features" are registered one at a time with the server. Some
servers are buggy and don't support the protocol correctly, which
means setting this key to true may fix the problem.
* `init`, `config`: Two different ways of sending an arbitrary
configuration blob to the language server. Sometimes a language
server will need one or the other of them to be set to some
particular value "because that's what VSCode does", or it won't work
properly.
* `lang`: For some reason the client tells the server what language
it thinks the current file is written in. This really shouldn't
make a difference, but sometimes servers will barf if the magic
string isn't quite right. In that case you can override it with the
`lang` key.
* `code`, `after`, `item`: These are used in the test suite (see
later).
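For reference when poking at language servers (e.g. via `make lsp`), they speak JSON-RPC over stdio with `Content-Length` framing. A minimal sketch of encoding an `initialize` request; this is the LSP base protocol, not Riju-specific code:

```python
import json

def encode_lsp_message(payload):
    """Frame a JSON-RPC message the way LSP servers expect on stdin:
    a Content-Length header, a blank line, then the JSON body."""
    body = json.dumps(payload)
    return f"Content-Length: {len(body)}\r\n\r\n{body}".encode()

initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {"processId": None, "rootUri": None, "capabilities": {}},
}
message = encode_lsp_message(initialize)
```

The `init` and `config` keys described above end up inside messages framed exactly like this, which is why a server that wants a particular blob "because that's what VSCode does" can be satisfied by overriding them.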

View File

@ -1 +1,9 @@
TODO
# Tutorial: providing metadata
Recently I've been trying to add semantic metadata to each language,
to be used for categorization and search of the ever-growing number of
languages.
I promise I will write detailed documentation on the format of this
metadata "soon", but for now please see the JSON Schema and examples
in some existing languages!

View File

@ -110,6 +110,7 @@ void session(char *uuid, char *lang, char *imageHash)
"--cpus", "1",
"--memory", "1g",
"--memory-swap", "3g",
"--pids-limit", "512",
image, "bash", "-c",
"cat /var/run/riju/sentinel/fifo | ( sleep 10; while read -t2; do :; done; pkill -g0 )",
NULL,