..
    This file is part of Invenio.
    Copyright (C) 2018 CERN.

    Invenio is free software; you can redistribute it and/or modify it
    under the terms of the MIT License; see LICENSE file for more details.

.. _build-repository:

Repository structure
====================

This section describes the file structure of a standard Invenio project. The
cookiecutter generates this skeleton for you. Note that the following file
structure is our default recommendation; you are completely free to adapt it
as you see fit.

Management scripts
------------------

.. code-block:: shell

    ...
    ├── scripts
    │   ├── bootstrap
    │   ├── console
    │   ├── server
    │   ├── setup
    │   └── update
    ...

In your root folder, you will find the ``scripts`` directory, which contains
executable bash scripts that will assist you with developing and managing your
Invenio instance:

**scripts/bootstrap**
    Installs all of the Python dependencies and your application's code, then
    collects and builds the static files required for the instance to run.
    We'll talk more about how we manage `dependencies`_ in the relevant
    section below.

**scripts/setup**
    (Re)initializes data needed for services that hold application state,
    i.e.:

    - Database tables
    - Elasticsearch indices and templates
    - RabbitMQ queues
    - Redis databases

    This script is also useful when you're doing local development and want to
    start from a clean state.

    .. warning::

        This script performs destructive and non-reversible operations. Only
        run it when you initialize your instance for the first time. Running
        it in, e.g., a production or testing environment will remove all
        existing data.

**scripts/server**
    Fires up a development HTTPS-enabled Flask web server at https://localhost
    for your application, together with a Celery worker. As you make HTTP
    requests to the web application or run any tasks, you will see
    information, warnings and errors being logged in the terminal.
    Interrupting this script will automatically stop both services.

**scripts/console**
    This will spawn an interactive IPython shell with your application fully
    loaded. You can use it to run arbitrary Python commands while having
    access to your application's database models for queries. This is a great
    tool for testing functionality during development, and for troubleshooting
    and fixing problems on a live instance.

**scripts/update**
    This will repeat all of the steps of the ``bootstrap`` script, and will
    additionally apply any new Alembic recipes for the database and
    Elasticsearch index changes.

.. _dependencies:

Python dependencies and packaging
---------------------------------

.. code-block:: shell

    ...
    ├── Pipfile
    ├── Pipfile.lock
    ├── setup.py
    ├── MANIFEST.in
    ...

To manage our Python dependencies we have chosen to use `pipenv `_. Pipenv
does the following:

- Tracks your *loose* Python dependencies inside ``Pipfile``.
- Pins specific versions (and hashes) of your Python dependencies inside
  ``Pipfile.lock``. The existence of this file is essential to make sure that
  when you deploy your instance on a production environment, you can reproduce
  the exact same environment that you used when you developed and tested your
  application.
- Automatically creates a Python virtualenv with the correct Python version
  under the path defined in the ``WORKON_HOME`` environment variable (commonly
  used by ``virtualenvwrapper``). If not set, new virtualenvs will be placed
  under ``$HOME/.local/share/virtualenvs/``.

We still need a ``setup.py`` file though, not for tracking any dependencies,
but for specifying the entry points that various Invenio packages rely on to
automatically detect and register Flask blueprints, Celery tasks and other
features.

Docker and Docker-Compose
-------------------------

.. code-block:: shell

    ...
    ├── docker
    │   ├── postgres
    │   │   ├── ...
    │   ├── uwsgi
    │   │   ├── ...
    │   ├── nginx
    │   │   ├── ...
    │   ├── haproxy
    │   │   ├── ...
    ├── docker-services.yml
    ├── docker-compose.yml
    ├── docker-compose.full.yml
    ├── Dockerfile.base
    ├── Dockerfile
    ...

The instance requires some services in order to run, like a database,
Elasticsearch, Redis and RabbitMQ. To provide a cross-platform and convenient
way of running these services, we are using Docker and Docker Compose,
configured through the following files:

**docker-services.yml**
    This file contains basic definitions for the Docker containers for the
    services the instance uses. Configuration options such as the database
    credentials, exposed ports, and other service-specific options can be
    modified here. The containers defined in this file are used as a common
    base and are extended by the other ``docker-compose.*.yml`` files to build
    up a specific configuration for an infrastructure.

**docker-compose.yml**
    This file defines, and exposes locally, the minimal set of service
    containers needed for developing the instance:

    - ``db``: The database, PostgreSQL or MySQL, exposing port 5432 or 3306
      respectively.
    - ``es``: Elasticsearch version 6 or 7, exposing the 9200 and 9300 ports.
    - ``mq``: RabbitMQ, exposing port 5672 for the service and port 15672 for
      a management web server (accessible via the default username/password
      ``guest:guest``).
    - ``cache``: Redis, exposing port 6379.

    When developing and running your instance locally, these services can be
    accessed by your application.

**docker-compose.full.yml**
    This file contains a full-fledged definition of a more scalable
    application infrastructure. It has all of the ``docker-compose.yml``
    file's containers defined, and additionally:

    - ``lb``: HAProxy, publicly exposing ports 80 and 443 for accessing the
      web application and 8080 for accessing statistics.
    - ``frontend``: Nginx, exposing ports 80 and 443, acting as a reverse
      proxy for your application containers and serving static files.
    - ``web-ui``/``web-api``: Two separate web application containers running
      uWSGI for the Invenio UI and REST API applications and exposing port
      5000.
    - ``worker``: The Celery worker of your application.
    - ``flower``: Monitoring web application for Celery, publicly exposing
      port 5555.
    - ``kibana``: Monitoring web application for Elasticsearch, publicly
      exposing port 5601.

    The ``web-ui``, ``web-api`` and ``worker`` containers use Docker images
    that are built from the ``Dockerfile.base`` and ``Dockerfile`` files
    described below.

    .. warning::

        This file is **not** intended to be used for production, nor as a
        reference for a production infrastructure. It is just an example of a
        more complete application deployment.

**Dockerfile.base**
    This Dockerfile helps you build a base image containing only the Python
    dependencies, from which your application image can be built quickly.

**Dockerfile**
    This Dockerfile builds a fully functional image of your application with
    all of the static assets it requires.

**docker/postgres**
    Contains a Dockerfile and a script that will set up the necessary
    users/roles for the database.

**docker/uwsgi**
    Contains the ``uwsgi_ui.ini`` and ``uwsgi_api.ini`` uWSGI configuration
    files used for running the Invenio UI and REST API web applications.

**docker/nginx**
    Contains a Dockerfile, nginx configurations (``nginx.conf`` and
    ``conf.d/default.conf``) and a generated self-signed SSL certificate
    (``test.crt`` and ``test.key``). You can look into these files if you are
    interested in how to configure nginx to proxy requests to one or multiple
    uWSGI web applications.

**docker/haproxy**
    Contains a Dockerfile, HAProxy configuration (``haproxy.cfg``) and a
    generated self-signed SSL certificate (``haproxy_cert.pem``).

Configuration
-------------

.. code-block:: shell

    ...
    ├── my_site
    │   ├── config.py
    ...

**my_site/config.py**
    The instance's basic configuration variables are defined inside this file.
    You should go through all of these variables to understand what can be
    customized for your instance, for example the "From" email address used
    for your automatically sent emails.

The configuration used by the Invenio applications is dynamically loaded from
multiple sources. You can read more about this in
`Invenio-Config documentation `_. Probably the most important part of this is
the order in which the various configuration sources are loaded, which allows
you to effectively override any config variable. The following list describes
this order (every item overrides the one above it):

- Configuration modules defined in ``invenio_config.module`` entry points.
  ``my_site.config`` is actually one of them. You can add as many as you want
  and they will be applied in alphabetical order of the entry point name.
- Configuration in the ``/invenio.cfg``. For local development this is usually
  ``${VIRTUAL_ENV}/var/instance/invenio.cfg``.
- ``INVENIO_XYZ`` environment variables. If, for example, you want to override
  the ``SECRET_KEY``, you would run
  ``export INVENIO_SECRET_KEY="my-secret"``.

Tests
-----

.. code-block:: shell

    ...
    ├── tests
    │   ├── api
    │   │   ├── conftest.py
    │   │   └── test_api_simple_flow.py
    │   ├── e2e
    │   │   ├── conftest.py
    │   │   └── test_front_page.py
    │   ├── ui
    │   │   └── conftest.py
    │   ├── conftest.py
    │   └── test_version.py
    ├── pytest.ini
    ├── run-tests.sh
    ...

In Invenio we're using the Python `pytest `_ library for testing. All of the
instance's tests are placed in the ``tests/`` directory.

**tests/ui/**
    Includes tests that use the UI application views.

**tests/api/**
    Includes tests that use the REST API application views.

**tests/e2e/**
    Includes Selenium-based end-to-end tests which access both the UI and REST
    API applications.

**pytest.ini**
    Used to configure ``pytest`` and its various plugins.

**run-tests.sh**
    You can run this script locally or in your CI/CD pipeline and it will
    check:

    - Your Python dependencies for security vulnerabilities using
      `pyup.io's "safety" library `_.
    - Your docstring style based on `PEP 257 `_.
    - Your Python imports for the correct sorting order using `isort `_.
    - Your ``MANIFEST.in`` for any missing entries.
    - That your docs build without errors.
    - That your tests pass.

Documentation
-------------

.. code-block:: shell

    ...
    ├── docs
    │   ├── api.rst
    │   ├── authors.rst
    │   ├── changes.rst
    │   ├── configuration.rst
    │   ├── conf.py
    │   ├── contributing.rst
    │   ├── index.rst
    │   ├── installation.rst
    │   ├── license.rst
    │   ├── make.bat
    │   ├── Makefile
    │   ├── requirements.txt
    │   └── usage.rst
    ├── AUTHORS.rst
    ├── CHANGES.rst
    ├── CONTRIBUTING.rst
    ├── INSTALL.rst
    ├── README.rst
    ...

To build the instance's documentation we're using `Sphinx docs `_, with
`reStructuredText `_ as the markup language.

**docs/*.rst**
    The various ``.rst`` files are placed in the root of your repository and
    in the ``docs/`` directory. They are used to build your instance's
    documentation by running ``pipenv run build_sphinx``.

**docs/conf.py**
    This is the place where various documentation configuration variables can
    be set. You can have a look at it and tweak things based on
    `Sphinx docs' extensive section on its configuration `_.
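
As a rough illustration of the kind of variables ``docs/conf.py`` holds, the
sketch below sets a few standard Sphinx options. The project name, author and
choice of extensions are placeholder assumptions, not the values the
cookiecutter generates for you:

.. code-block:: python

    # Hypothetical excerpt of a docs/conf.py; all values are placeholders.

    # General information about the project (assumed names).
    project = 'My Site'
    author = 'My Site Authors'

    # Sphinx extensions to enable; both of these ship with Sphinx itself.
    extensions = [
        'sphinx.ext.autodoc',      # pull documentation from docstrings
        'sphinx.ext.intersphinx',  # link to other projects' documentation
    ]

    # Files and directories Sphinx should ignore when looking for sources.
    exclude_patterns = ['_build']

    # Theme used for the HTML output; 'alabaster' is bundled with Sphinx.
    html_theme = 'alabaster'

Because ``conf.py`` is a plain Python module, any of these values can be
changed or computed at build time before Sphinx reads them.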