Life of a Request

It can sometimes be confusing to figure out how to write a new feature, or debug an existing one. Let us try to follow a request through the Zulip codebase, and dive deep into how each part works.

We will use as our example the creation of users through the API, but we will also highlight how alternative requests are handled.

A request is sent to the server, and handled by Nginx

When Zulip is deployed in production, all requests go through nginx. For the most part we don't need to know how this works, except for when it isn't working. Nginx does the first level of routing--deciding which application will serve the request (or deciding to serve the request itself for static content).

In development, tools/run-dev.py fills the role of nginx. Static files are in your git checkout under static, and are served unminified.

Nginx secures traffic with SSL

If you visit your Zulip server in your browser and discover that your traffic isn't being properly encrypted, an nginx misconfiguration is the likely culprit.

Static files are served directly by Nginx

Static files include JavaScript, CSS, static assets (like emoji and avatars), and user uploads (if stored locally and not on S3).

location /static/ {
    alias /home/zulip/prod-static/;
    error_page 404 /static/html/404.html;
}

Nginx routes other requests between Tornado and Django

All our connected clients hold open long-polling connections so that they can receive events (messages, presence notifications, and so on) in real time. Events are served by Zulip's Tornado application.

Nearly every other kind of request is served by the zerver Django application.
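As a rough mental model of this first-level routing, the decision can be pictured as a small Python function. This is only a sketch, not the nginx configuration: the /static/ prefix comes from the location block shown earlier, while the event path prefixes and the returned labels are assumptions made for illustration.

```python
# Hypothetical sketch of nginx's first-level routing decision.
# Only "/static/" is taken from the location block in this document; the
# event path prefixes below are assumptions made for this illustration.
STATIC_PREFIX = "/static/"
ASSUMED_EVENT_PREFIXES = ("/json/events", "/api/v1/events")


def choose_backend(path: str) -> str:
    """Return a label for the component that would serve `path`."""
    if path.startswith(STATIC_PREFIX):
        return "nginx"      # static content is served by nginx itself
    if path.startswith(ASSUMED_EVENT_PREFIXES):
        return "tornado"    # long-polling event requests
    return "django"         # nearly every other kind of request
```

Under these assumptions, a request for /api/v1/users falls through both checks and lands in Django, which is the path our user creation example takes.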

Here is the relevant nginx routing configuration.

Django routes the request to a view in urls.py files

There are various urls.py files throughout the server codebase, which are covered in more detail in the directory structure doc.

The main Zulip Django app is zerver. The routes are found in

zproject/urls.py
zproject/legacy_urls.py

There are HTML-serving, REST API, legacy, and webhook url patterns. We will look at how each of these types of requests is handled, and focus on how the REST API handles our user creation example.

Views serving HTML are internationalized by server path

If we look in zproject/urls.py, we can see something called i18n_urls. These urls show up in the address bar of the browser, and serve HTML.

For example, the /hello page is translated into Chinese at zh-cn/hello/.

Note the zh-cn prefix--that url pattern gets added by i18n_patterns.
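To make the effect of the prefix concrete, here is a minimal sketch of how a language code could be split off a browser-facing path. It is not Django's implementation of i18n_patterns; the function name and the SUPPORTED_LANGUAGES set are hypothetical.

```python
# Hypothetical sketch: split a language prefix such as "zh-cn" off a path.
# SUPPORTED_LANGUAGES is an assumed set used only for this illustration.
SUPPORTED_LANGUAGES = {"en", "zh-cn", "de"}


def split_language_prefix(path: str, default: str = "en"):
    """Return (language, remaining_path) for a browser-facing URL path."""
    parts = path.lstrip("/").split("/", 1)
    if parts[0] in SUPPORTED_LANGUAGES:
        rest = "/" + (parts[1] if len(parts) > 1 else "")
        return parts[0], rest
    # No recognized prefix: fall back to the default language.
    return default, "/" + path.lstrip("/")
```

With this sketch, zh-cn/hello/ resolves to the language zh-cn and the path /hello/, while a plain /hello keeps the default language.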

API endpoints use REST

Our example is a REST API endpoint. It's a PUT to /users.

With the exception of Webhooks (which we do not usually control the format of), legacy endpoints, and logged-out endpoints, Zulip uses REST for its API. This means that we use:

  • POST for creating something new where we don't have a unique ID. Also used as a catch-all if no other verb is appropriate.
  • PUT for creating something for which we have a unique ID.
  • DELETE for deleting something
  • PATCH for updating or editing attributes of something.
  • GET to get something (read-only)
  • HEAD to check the existence of something to GET, without getting it; useful to check a link without downloading a potentially large resource
  • OPTIONS (handled automatically, see more below)

Of these, PUT, DELETE, HEAD, OPTIONS, and GET are idempotent, which means that we can send the request multiple times and get the same state on the server. You might get a different response after the first request, as we like to give our clients an error so they know that no new change was made by the extra requests.

POST is not idempotent--if I send a message multiple times, Zulip will show my message multiple times. PATCH is special--it can be idempotent, and we like to write API endpoints in an idempotent fashion, as much as possible.
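The difference can be illustrated with a small in-memory sketch. The two stores, the function names, and the error message are hypothetical; the point is only that repeating the PUT leaves the server state unchanged, while repeating the POST does not.

```python
# Minimal in-memory sketch contrasting an idempotent PUT with a
# non-idempotent POST. Both stores and both functions are hypothetical.
users = {}      # keyed by a unique identifier (the email)
messages = []   # the client supplies no unique identifier


def put_user(email: str, full_name: str) -> dict:
    """Create a user; repeating the call leaves the state unchanged."""
    if email in users:
        # State is untouched, but the client learns that nothing new happened.
        return {"result": "error", "msg": "user already exists"}
    users[email] = {"full_name": full_name}
    return {"result": "success", "msg": ""}


def post_message(content: str) -> dict:
    """Send a message; every call appends another copy."""
    messages.append(content)
    return {"result": "success", "msg": ""}
```

Calling put_user twice with the same email yields one user and an error on the second call; calling post_message twice with the same content yields two messages.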

This cookbook and tutorial can be helpful if you are new to REST web applications.

PUT is only for creating new things

If you're used to using PUT to update or modify resources, you might find our convention a little strange.

We use PUT to create resources with unique identifiers, POST to create resources without unique identifiers (like sending a message with the same content multiple times), and PATCH to modify resources.

In our example, create_user_backend uses PUT, because there's a unique identifier, the user's email.

OPTIONS

The OPTIONS method will yield the allowed methods.

This request: OPTIONS https://zulip.tabbott.net/api/v1/users yields a response with this HTTP header: Allow: PUT, GET

We can see this reflected in zproject/urls.py:

url(r'^users$', 'zerver.lib.rest.rest_dispatch',
    {'GET': 'zerver.views.users.get_members_backend',
     'PUT': 'zerver.views.users.create_user_backend'}),

In this way, the API is partially self-documenting.
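As a sketch of why this works, the Allow header can be derived directly from the verb-to-view mapping. The allow_header helper below is hypothetical and is not taken from zerver/lib/rest.py.

```python
# Sketch: build an Allow header value from the verb-to-view mapping.
# The mapping mirrors the url() entry for ^users$; allow_header is a
# hypothetical helper, not a function from the Zulip codebase.
USERS_METHODS = {
    "GET": "zerver.views.users.get_members_backend",
    "PUT": "zerver.views.users.create_user_backend",
}


def allow_header(methods: dict) -> str:
    """Return the value of the Allow header for an OPTIONS response."""
    # Iterating a dict yields its keys, here the supported HTTP verbs.
    return ", ".join(methods)
```

The order of the verbs in the header is not significant, so the output of this sketch may list them in a different order than a real server does.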

Legacy endpoints are used by the web client

The endpoints from the legacy JSON API are written without REST in mind. They are used extensively by the web client, and use POST.

You can see them in zproject/legacy_urls.py.

Webhook integrations may not be RESTful

Zulip endpoints that are called by other services for integrations have to conform to the service's request format. They are likely to use only POST.

Some integrations will only provide an API key for their webhooks. For these integrations, we use the api_key_only_webhook_view decorator, to fill in the user_profile and client fields of a request:

@api_key_only_webhook_view('PagerDuty')
@has_request_variables
def api_pagerduty_webhook(request, user_profile, client,
                          payload=REQ(argument_type='body'),
                          stream=REQ(default='pagerduty'),
                          topic=REQ(default=None)):

The client will be the result of get_client("ZulipPagerDutyWebhook") in this example.
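The general shape of such a decorator can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual api_key_only_webhook_view: the request is a plain dict, the user lookup is a hypothetical in-memory table, and the client is represented by its name as a string.

```python
import functools

# Hypothetical stand-in for a database lookup of users by API key.
FAKE_USERS_BY_API_KEY = {"abc123": "bot@example.com"}


def api_key_only_webhook_view_sketch(client_name):
    """Assumed behaviour: authenticate by API key alone, then pass
    user_profile and client to the wrapped view."""
    def decorator(view_func):
        @functools.wraps(view_func)
        def wrapper(request, *args, **kwargs):
            user_profile = FAKE_USERS_BY_API_KEY.get(request.get("api_key"))
            if user_profile is None:
                return {"result": "error", "msg": "Invalid API key"}
            # Mirrors the client naming described in the text.
            client = "Zulip{}Webhook".format(client_name)
            return view_func(request, user_profile, client, *args, **kwargs)
        return wrapper
    return decorator


@api_key_only_webhook_view_sketch("PagerDuty")
def example_webhook(request, user_profile, client):
    return {"result": "success", "user": user_profile, "client": client}
```

Here example_webhook({"api_key": "abc123"}) receives the client name ZulipPagerDutyWebhook, while a request without a known key is rejected before the view runs.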

Django calls rest_dispatch for REST endpoints, and authenticates

For requests that correspond to a REST url pattern, Zulip configures its url patterns (see zerver/lib/rest.py) so that the action called is rest_dispatch. This method will authenticate the user, either through a session token from a cookie, or from an email:api-key string given via HTTP Basic Auth for API clients.

It will then look up what HTTP verb was used (GET, POST, etc) to make the request, and then figure out which view to show from that.

In our example,

{'GET': 'zerver.views.users.get_members_backend',
 'PUT': 'zerver.views.users.create_user_backend'}

is supplied as an argument to rest_dispatch, along with the HttpRequest. The request has the HTTP verb PUT, which rest_dispatch can use to find the correct view to call: zerver.views.users.create_user_backend.
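The two steps, authentication and dispatch by verb, can be sketched in a few lines. This is a simplified illustration and not the code in zerver/lib/rest.py: the request is a plain dict, only the HTTP Basic Auth path is shown (session cookies are omitted), the API key is parsed but not verified, and the stand-in views and the 405 string are assumptions.

```python
import base64


# Hypothetical views standing in for the dotted paths in urls.py.
def get_members_backend(request, user_email):
    return "listing members for " + user_email


def create_user_backend(request, user_email):
    return "creating user on behalf of " + user_email


METHODS = {"GET": get_members_backend, "PUT": create_user_backend}


def parse_basic_auth(header: str):
    """Decode 'Basic <base64(email:api-key)>' into (email, api_key)."""
    scheme, _, encoded = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("unsupported auth scheme")
    decoded = base64.b64decode(encoded).decode("utf-8")
    email, sep, api_key = decoded.partition(":")
    if not sep:
        raise ValueError("malformed credentials")
    return email, api_key


def rest_dispatch_sketch(request: dict, methods: dict):
    """Authenticate, then call the view registered for the request's verb."""
    email, _api_key = parse_basic_auth(request["authorization"])
    # A real server would verify the API key here; this sketch skips that.
    view = methods.get(request["method"])
    if view is None:
        return "405 Method Not Allowed"
    return view(request, email)
```

A PUT request reaches create_user_backend, and a verb missing from the mapping, such as DELETE, never reaches a view at all.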

The view will authorize the user, extract request variables, and validate them

There are some special decorators we may use for a given view. Our example uses require_realm_admin and has_request_variables:

@require_realm_admin
@has_request_variables
def create_user_backend(request, user_profile, email=REQ(), password=REQ(),
                        full_name=REQ(), short_name=REQ()):

require_realm_admin checks the authorization of the given user_profile to make sure it belongs to a realm admin and thus has permission to create a user.

We can see a special REQ() in the keyword arguments to create_user_backend. The body of a request is expected to be in JSON format, so this is used in conjunction with the has_request_variables decorator to unpack the variables in the JSON string for use in the function. The implementation of has_request_variables is documented heavily in zerver/lib/request.py.
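The pattern of a marker default plus a decorator can be sketched as below. This is a much-reduced illustration and not the implementation in zerver/lib/request.py: the request is a plain dict with a JSON string under "body", and the error message wording is an assumption.

```python
import functools
import inspect
import json


class REQ:
    """Marker default meaning 'fill this argument from the request'."""
    _MISSING = object()

    def __init__(self, default=_MISSING):
        self.default = default


def has_request_variables_sketch(view_func):
    """Replace REQ() defaults with values unpacked from the JSON body."""
    params = inspect.signature(view_func).parameters

    @functools.wraps(view_func)
    def wrapper(request, *args, **kwargs):
        variables = json.loads(request["body"])
        for name, param in params.items():
            if not isinstance(param.default, REQ) or name in kwargs:
                continue  # ordinary argument, or supplied by the caller
            if name in variables:
                kwargs[name] = variables[name]
            elif param.default.default is not REQ._MISSING:
                kwargs[name] = param.default.default
            else:
                return {"result": "error",
                        "msg": "Missing '%s' argument" % (name,)}
        return view_func(request, *args, **kwargs)
    return wrapper


@has_request_variables_sketch
def create_user_sketch(request, email=REQ(), full_name=REQ()):
    return {"result": "success", "email": email, "full_name": full_name}
```

The view body never touches the raw request: by the time it runs, email and full_name are ordinary Python values, and a missing variable has already produced an error response.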

REQ also helps us with request variable validation. For example: msg_ids = REQ(validator=check_list(check_int)) will check that the msg_ids request variable is a list of integers, marshalled as JSON.

See zerver/lib/validator.py for more validators and their documentation.
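To show how validators like these compose, here is a simplified sketch. These are stand-ins written for illustration, not the code in zerver/lib/validator.py, and the convention used (return None on success, an error string on failure) and the message wording are assumptions.

```python
# Simplified sketch of composable validators. Assumed convention:
# return None on success, or an error string describing the failure.
def check_int(var_name, val):
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(val, bool) or not isinstance(val, int):
        return "%s is not an integer" % (var_name,)
    return None


def check_list(sub_validator):
    """Build a validator for lists whose items all pass sub_validator."""
    def validator(var_name, val):
        if not isinstance(val, list):
            return "%s is not a list" % (var_name,)
        for i, item in enumerate(val):
            error = sub_validator("%s[%d]" % (var_name, i), item)
            if error is not None:
                return error
        return None
    return validator
```

Because check_list returns a function with the same shape as check_int, validators nest freely: check_list(check_list(check_int)) would validate a list of lists of integers.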

If the view does any modification to the database, that change is done in a helper function in zerver/lib/actions.py.

Results are given as JSON

Our API works on JSON requests and responses. Every API endpoint should return json_error in the case of an error, which gives a JSON string:

{'result': 'error', 'msg': <some error message>}

in an HTTP response with a content type of 'application/json'.

To pass back data from the server to the calling client, in the event of a successfully handled request, we use json_success(data=<some python object which can be converted to a JSON string>).

This will result in a JSON string:

{'result': 'success', 'msg': '', 'data': {'var_name1': 'var_value1', 'var_name2': 'var_value2', ...}}

with an HTTP 200 status and a content type of 'application/json'.
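The two response shapes can be sketched with the standard json module. The helpers below are hypothetical stand-ins that return a plain tuple instead of a Django response object, and the 400 status used for errors is an assumption, since only the 200 status for success is stated here.

```python
import json


def json_response_sketch(result, msg="", data=None, status=200):
    """Return (status, content_type, body) for a JSON API response."""
    payload = {"result": result, "msg": msg}
    if data is not None:
        payload["data"] = data
    return status, "application/json", json.dumps(payload)


def json_success_sketch(data=None):
    return json_response_sketch("success", data=data)


def json_error_sketch(msg, status=400):
    # The 400 status is an assumption; the text does not state one.
    return json_response_sketch("error", msg=msg, status=status)
```

Keeping a 'result' and 'msg' key in every response means a client can check for success the same way on every endpoint.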

That's it!