From 8461048a115f6e5917ce837cdff0fd511804295c Mon Sep 17 00:00:00 2001 From: Yoyo Zhou Date: Mon, 17 Aug 2015 16:12:38 -0700 Subject: [PATCH] Add 'Code style', generated via pandoc -r mediawiki -w rst and html_unescape.py With a basic README.md (imported from commit 91728b2c591bc88b7bee520669cb0b3f53426cd8) --- docs/README.md | 17 ++ docs/code-style.rst | 481 ++++++++++++++++++++++++++++++++++++++++++++ docs/index.rst | 2 +- 3 files changed, 499 insertions(+), 1 deletion(-) create mode 100644 docs/README.md create mode 100644 docs/code-style.rst diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 0000000000..4e67ca88cb --- /dev/null +++ b/docs/README.md @@ -0,0 +1,17 @@ +To create rST from MediaWiki input: + + * Use `pandoc -r mediawiki -w rst` on MediaWiki source. + * Use unescape.py to remove any leftover HTML entities (often inside
+     tags and the like).
+
+We can use pandoc to translate mediawiki into reStructuredText, but some things need fixing up:
+
+   * Add page titles.
+   * Review pages for formatting (especially inline code chunks) and content.
+   * Fix wiki links?
+   * Add pages to the table of contents (`index.rst`).
+
+To generate HTML docs locally from RST:
+
+   * `pip install sphinx`
+   * In this directory, `make html`. Output appears in a `_build/html` subdirectory.
diff --git a/docs/code-style.rst b/docs/code-style.rst
new file mode 100644
index 0000000000..b2f0190603
--- /dev/null
+++ b/docs/code-style.rst
@@ -0,0 +1,481 @@
+==========
+Code style
+==========
+
+.. attention::
+   Needs formatting and content review
+
+Be consistent!
+==============
+
+Look at the surrounding code, or a similar part of the project, and try
+to do the same thing. If you think the other code has actively bad
+style, fix it (in a separate commit). Talk to the author first if you
+can.
+
+If in doubt, ask your colleagues or look online for guidance from
+experts.
+
+Lint tools
+==========
+
+You can run them all at once with
+
+::
+
+    ./tools/lint-all
+
+You can set this up as a local Git commit hook with
+
+::
+
+    echo ./tools/lint-all > .git/hooks/pre-commit && chmod +x .git/hooks/pre-commit
+
+Javascript
+----------
+
+We use `JSLint `__ for checking Javascript code.
+You can simply run
+
+::
+
+    ./tools/jslint/check-all.js
+
+assuming that you have Node.js in your ``$PATH`` as either ``node`` or
+``nodejs``.
+
+We should try to keep the error count at zero, because having a bunch of
+"expected errors" will convince people to ignore the tool entirely.
+
+``tools/jslint/check-all.js`` contains a pretty fine-grained set of
+JSLint options, rule exceptions, and allowed global variables. If you
+add a new global, you'll need to add it to the list.
+
+Python
+------
+
+Run `Pyflakes `__ on your Python
+code.
+
+Secrets
+=======
+
+Please don't put any passwords, secret access keys, etc. inline in the
+code. These secrets should always live in either Puppet or (for our
+Django backend) in ``zproject/local_settings.py``. Otherwise, we will
+end up distributing these secrets to our local server customers.
+
+Dangerous constructs
+====================
+
+Misuse of database queries
+--------------------------
+
+Look out for Django code like this:
+
+| ``[Foo.objects.get(id=bar.x.id)``
+| ``for bar in Bar.objects.filter(...)``
+| ``if  bar.baz < 7]``
+
+This will make one database query for each ``Bar``, which is slow in
+production (but not in local testing!). Instead of a list comprehension,
+write a single query using Django's `QuerySet
+API `__.
+
+If you can't rewrite it as a single query, that's a sign that something
+is wrong with the database schema. So don't defer this optimization when
+performing schema changes, or else you may later find that it's
+impossible.
+
+UserProfile.objects.get() / Client.objects.get / etc.
+-----------------------------------------------------
+
+In our Django code, never do direct
+``UserProfile.objects.get(email=foo)`` database queries. Instead always
+use ``get_user_profile_by_{email,id}``. There are 3 reasons for this:
+
+#. It's guaranteed to correctly do a case-inexact lookup
+#. It fetches the user object from memcached, which is faster
+#. It always fetches a UserProfile object which has been queried using
+   .selected\_related(), and thus will perform well when one later
+   accesses related fields like the Realm.
+
+Similarly we have ``get_client`` and ``get_stream`` functions to fetch
+those commonly accessed objects via memcached.
+
+Using Django model objects as keys in sets/dicts
+------------------------------------------------
+
+Don't use Django model objects as keys in sets/dictionaries -- you will
+get unexpected behavior when dealing with objects obtained from
+different database queries:
+
+For example,
+``UserProfile.objects.only("id").get(id=17) in set([UserProfile.objects.get(id=17)])``
+is False
+
+You should work with the IDs instead.
+
+user\_profile.save()
+--------------------
+
+You should always pass the update\_fields keyword argument to .save()
+when modifying an existing Django model object. By default, .save() will
+overwrite every value in the column, which results in lots of race
+conditions where unrelated changes made by one thread can be
+accidentally overwritten by another thread that fetched its UserProfile
+object before the first thread wrote out its change.
+
+Using raw saves to update important model objects
+-------------------------------------------------
+
+In most cases, we already have a function in zephyr/lib/actions.py with
+a name like do\_activate\_user that will correctly handle lookups,
+caching, and notifying running browsers via the event system about your
+change. So please check whether such a function exists before writing
+new code to modify a model object, since your new code has a good chance
+of getting at least one of these things wrong.
+
+``x.attr('zid')`` vs. ``rows.id(x)``
+------------------------------------
+
+Our message row DOM elements have a custom attribute ``zid`` which
+contains the numerical message ID. **Don't access this directly as
+``x.attr('zid')``!** The result will be a string and comparisons (e.g.
+with ``<=``) will give the wrong result, occasionally, just enough to
+make a bug that's impossible to track down.
+
+You should instead use the ``id`` function from the ``rows`` module, as
+in ``rows.id(x)``. This returns a number. Even in cases where you do
+want a string, use the ``id`` function, as it will simplify future code
+changes. In most contexts in JavaScript where a string is needed, you
+can pass a number without any explicit conversion.
+
+Javascript var
+--------------
+
+Always declare Javascript variables using ``var``:
+
+``var x = ...;``
+
+In a function, ``var`` is necessary or else ``x`` will be a global
+variable. For variables declared at global scope, this has no effect,
+but we do it for consistency.
+
+Javascript has function scope only, not block scope. This means that a
+``var`` declaration inside a ``for`` or ``if`` acts the same as a
+``var`` declaration at the beginning of the surrounding ``function``. To
+avoid confusion, declare all variables at the top of a function.
+
+Javascript ``for (i in myArray)``
+---------------------------------
+
+Don't use it:
+`[1 `__],
+`[2 `__],
+`[3 `__]
+
+jQuery global state
+-------------------
+
+Don't mess with jQuery global state once the app has loaded. Code like
+this is very dangerous:
+
+| ``$.ajaxSetup({ async: false });``
+| ``$.get(...);``
+| ``$.ajaxSetup({ async: true });``
+
+jQuery and the browser are free to run other code while the request is
+pending, which could perform other Ajax requests with the altered
+settings.
+
+Instead, switch to the more general
+```$.ajax`` `__ function, which can
+take options like ``async``.
+
+State and logs files
+--------------------
+
+Do not write state and logs files inside the current working directory
+in the deployed environment. This will not work correctly, because the
+current working directory for the app changes every time we do a deploy.
+Instead, hardcode a path when settings.py -- see SERVER\_LOG\_PATH in
+settings.py for an example.
+
+JS array/object manipulation
+============================
+
+For generic functions that operate on arrays or JavaScript objects, you
+should generally use `Underscore `__. We used
+to use jQuery's utility functions, but the Underscore equivalents are
+more consistent, better-behaved and offer more choices.
+
+A quick conversion table:
+
+| ``   $.each → _.each (parameters to the callback reversed)``
+| ``   $.inArray → _.indexOf (parameters reversed)``
+| ``   $.grep → _.filter``
+| ``   $.map → _.map``
+| ``   $.extend → _.extend``
+
+There's a subtle difference in the case of ``_.extend``; it will replace
+attributes with undefined, whereas jQuery won't:
+
+| ``   $.extend({foo: 2}, {foo: undefined});  // yields {foo: 2}, BUT...``
+| ``   _.extend({foo: 2}, {foo: undefined});  // yields {foo: undefined}!``
+
+Also, ``_.each`` does not let you break out of the iteration early by
+returning false, the way jQuery's version does. If you're doing this,
+you probably want ``_.find``, ``_.every``, or ``_.any``, rather than
+'each'.
+
+Some Underscore functions have multiple names. You should always use the
+canonical name (given in large print in the Underscore documentation),
+with the exception of ``_.any``, which we prefer over the less clear
+'some'.
+
+More arbitrary style things
+===========================
+
+General
+-------
+
+Indentation is four space characters for Python, JS, CSS, and shell
+scripts. Indentation is two space characters for HTML templates.
+
+We never use tabs anywhere in source code we write, but we have some
+third-party files which contain tabs.
+
+Keep third-party static files under the directory
+``zephyr/static/third/``, with one subdirectory per third-party project.
+
+We don't have an absolute hard limit on line length, but we should avoid
+extremely long lines. A general guideline is: refactor stuff to get it
+under 85 characters, unless that makes the code a lot uglier, in which
+case it's fine to go up to 120 or so.
+
+Whitespace guidelines:
+
+-  Put one space (or more for alignment) around binary arithmetic and
+   equality operators.
+-  Put one space around each part of the ternary operator.
+-  Put one space between keywords like ``if`` and ``while`` and their
+   associated open paren.
+-  Put one space between the closing paren for ``if`` and ``while``-like
+   constructs and the opening curly brace. Put the curly brace on the
+   same line unless doing otherwise improves readability.
+-  Put no space before or after the open paren for function calls and no
+   space before the close paren for function calls.
+-  For the comma operator and colon operator in languages where it is
+   used for inline dictionaries, put no space before it and at least one
+   space after. Only use more than one space for alignment.
+
+Javascript
+----------
+
+Don't use ``==`` and ``!=`` because these operators perform type
+coercions, which can mask bugs. Always use ``===`` and ``!==``.
+
+End every statement with a semicolon.
+
+``if`` statements with no braces are allowed, if the body is simple and
+its extent is abundantly clear from context and formatting.
+
+Anonymous functions should have spaces before and after the argument
+list:
+
+``var x = function (foo, bar) { // ...``
+
+When calling a function with an anonymous function as an argument, use
+this style:
+
+| ``$.get('foo', function (data) {``
+| ``    var x = ...;``
+| ``    // ...``
+| ``});``
+
+The inner function body is indented one level from the outer function
+call. The closing brace for the inner function and the closing
+parenthesis for the outer call are together on the same line. This style
+isn't necessarily appropriate for calls with multiple anonymous
+functions or other arguments following them.
+
+Use
+
+``$(function () { ...``
+
+rather than
+
+``$(document).ready(function () { ...``
+
+and combine adjacent on-ready functions, if they are logically related.
+
+The best way to build complicated DOM elements is a Mustache template
+like ``zephyr/static/templates/message.handlebars``. For simpler things
+you can use jQuery DOM building APIs like so:
+
+``var new_tr = $('').attr('id', zephyr.id);``
+
+Passing a HTML string to jQuery is fine for simple hardcoded things:
+
+``foo.append('foo

');`` + +but avoid programmatically building complicated strings. + +We used to favor attaching behaviors in templates like so: + +:: + +

+ +but there are some reasons to prefer attaching events using jQuery code: + +- Potential huge performance gains by using delegated events where + possible +- When calling a function from an ``onclick`` attribute, ``this`` is + not bound to the element like you might think +- jQuery does event normalization + +Either way, avoid complicated JavaScript code inside HTML attributes; +call a helper function instead. + +HTML / CSS +---------- + +Don't use the ``style=`` attribute. Instead, define logical classes and +put your styles in ``zephyr.css``. + +Don't use the tag name in a selector unless you have to. In other words, +use ``.foo`` instead of ``span.foo``. We shouldn't have to care if the +tag type changes in the future. + +Don't use inline event handlers (``onclick=``, etc. attributes). +Instead, attach a jQuery event handler +(``$('#foo').on('click', function () {...})``) when the DOM is ready +(inside a ``$(function () {...})`` block). + +Use this format when you have the same block applying to multiple CSS +styles (separate lines for each selector): + +:: + + selector1, + selector2 { + }; + +Python +------ + +- Scripts should start with ``#!/usr/bin/env python`` and not + ``#!/usr/bin/python``. See commit ``437d4aee`` for an explanation of + why. Don't put such a line on a Python file unless it's meaningful to + run it as a script. (Some libraries can also be run as scripts, e.g. + to run a test suite.) +- The first import in a file should be + ``from __future__ import absolute_import``, per `PEP + 328 `__ +- Put all imports together at the top of the file, absent a compelling + reason to do otherwise. +- Unpacking sequences doesn't require list brackets: + +| ``[x, y] = xs    # unnecessary`` +| ``x, y = xs      # better`` + +- For string formatting, use ``x % (y,)`` rather than ``x % y``, to + avoid ambiguity if ``y`` happens to be a tuple. +- When selecting by id, don't use ``foo.pk`` when you mean ``foo.id``. + E.g. + ``recipient = Recipient(type_id=huddle.pk, type=Recipient.HUDDLE)`` + + should be written as + + ``recipient = Recipient(type_id=huddle.id, type=Recipient.HUDDLE)`` + + in case we ever change the primary keys. + +Version Control +=============== + +Commit Discipline +----------------- + +We follow the recommended git practice of "1 minimal coherent change per +commit". + +Coherency requirements for any commit: + +- It should pass tests (so test updates needed by a change should be in + the same commit as the original change, not a separate "fix the tests + that were broken by the last commit" commit). +- It should be safe to deploy individually, or comment in detail in the + commit message as to why it isn't (maybe with a [manual] tag). So + implementing a new API endpoint in one commit and then adding the + security checks in a future commit should be avoided -- the security + checks should be there from the beginning. +- Error handling should generally be included along with the code that + might trigger the error. +- TODO comments should probably be in the commit that introduces the + issue or functionality with further work required + +Exceptions: + +- If a commit is part of a long refactoring series, and the individual + commit is a step backward in code cleanliness, it's still preferred + to have it as a single commit, as long as it's a pure refactoring. + +When you should be minimal: + +- Significant refactorings should be done in a separate commit from + functional changes +- Moving code from one file to another should be done in a separate + commits from functional changes or even refactorings. +- 2 different refactorings should be done in different commits. +- 2 different features should be done in different commits. +- If you find yourself writing a commit message that reads like a list + of somewhat dissimilar things that you did, you probably should have + just done 2 commits. + +When not to be overly minimal: + +- For completely new features, you don't necessarily need to split out + new commits for each little subfeature of the new feature. E.g. if + you're writing a new tool from scratch, it's fine to have the initial + tool have plenty of options/features without doing separate commits + for each one. +- Don't bother to split back end commits from front end commits, even + though the backend can often be coherent on its own. + +Other considerations: + +- Overly fine commits are easily squashed, but not vice versa, so err + toward small commits, and the code reviewer can advise on squashing. + +Commit Messages +--------------- + +- The first line of commit messages should be written in the imperative + and be kept relatively short while concisely explaining what the + commit does. For example: + +Bad: + +| ``bugfix`` +| ``gather_subscriptions was broken`` + +Good: + +``Prevent gather_subscriptions from throwing an exception when given bad input.`` + +- Please use a complete sentence, ending with a period. + +- The rest of the commit message should be written in full prose and + explain why and how the change was made. If the commit makes + performance improvements, you should generally include some rough + benchmarks showing that it actually improves the performance. + +Third party code +---------------- + +When adding new third-party packages to our codebase, please include +"[third]" at the beginning of the commit message. You don't necessarily +need to do this when patching third-party code that's already in tree. diff --git a/docs/index.rst b/docs/index.rst index 26a46ddda6..2d41c5b1be 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -11,7 +11,7 @@ Contents: .. toctree:: :maxdepth: 2 - + Code Style Indices and tables ==================