A beginner's guide to the folder structure generated by cookiecutter-pypackage

cookiecutter-pypackage offers a well-equipped standard project template for creating a Python package. However, for many first-time users, the automatically generated folder structure can be quite intimidating.

The typical folder structure generated by cookiecutter-pypackage (https://github.com/cheeyeelim/cookiecutter-pypackage) (v1.1.2) looks like this:

[Figure: A typical cookiecutter-pypackage folder structure]

Note that all mentions of {{cookiecutter.project_slug}} and {{cookiecutter.pkg_name}} will be replaced by the user-supplied strings when you create the project using cookiecutter with https://github.com/cheeyeelim/cookiecutter-pypackage.git (which is derived from https://github.com/waynerv/cookiecutter-pypackage).

(For steps on how to use cookiecutter-pypackage, please refer to https://cheeyeelim.github.io/cookiecutter-pypackage/latest/.)

We will break down the key files and folders created below:

.github folder

.github
|- workflows
|-- dev.yml
|-- preview.yml
|-- release.yml
|- ISSUE_TEMPLATE.md

.github contains configurations used by GitHub repos. The YAML files in the workflows folder specify the CI/CD steps to run using GitHub Actions (https://docs.github.com/en/actions). Each YAML file represents one workflow, which can have its own trigger and its own steps. For example, the release.yml workflow is triggered only when a tag is pushed, and it then prepares the repo for publishing the Python library to PyPI.

ISSUE_TEMPLATE.md is used as the default template when a user creates an issue on GitHub for this specific project.

{{cookiecutter.pkg_name}} folder

{{cookiecutter.pkg_name}}
|- __init__.py
|- {{cookiecutter.pkg_name}}.py
|- cli.py

{{cookiecutter.pkg_name}} should be quite self-explanatory. This folder holds all the core Python code needed by the package. __init__.py contains meta information about the Python package, such as author and version.

{{cookiecutter.pkg_name}}.py represents the main entry point of the Python package. cli.py specifies the available command line entry points (i.e. it allows the package to be run directly as a shell command, rather than through a Python script). The command line interface is provided by click (https://click.palletsprojects.com/).
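To give a feel for what a click-based cli.py can contain, here is a minimal sketch. The command name main, the --name option, and the greeting are made up for illustration and are not what the template generates. It assumes click is installed, and it uses click's own test runner to call the command in-process instead of from a shell.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--name", default="world", help="Who to greet.")
def main(name: str) -> None:
    """Print a greeting (hypothetical command, for illustration only)."""
    click.echo(f"Hello, {name}!")


# Invoke the command in-process, as if "--name Ada" had been typed in a shell.
result = CliRunner().invoke(main, ["--name", "Ada"])
print(result.output)
```

Once such a function is registered as a console script, the same command becomes available directly from the shell.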

docs folder

docs
|- api.md
|- changelog.md
|- contributing.md
|- index.md
|- installation.md
|- usage.md

The docs folder holds Markdown documents that will be built by mike (https://github.com/jimporter/mike) (based on mkdocs) into documentation for this Python package. The documents contain default text that should apply to most projects, but do check and verify them manually.

index.md and changelog.md automatically load their content from the README.md and CHANGELOG.md files at the folder root. This ensures that processes that depend on README.md and CHANGELOG.md can find them easily (e.g. README.md at the folder root is used as the repo README).

api.md is another special Markdown file that will contain API documentation generated automatically from docstrings. This allows API documentation to be created with relatively little effort.
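For readers who have not written docstrings before, the sketch below shows the kind of text a documentation generator can pick up. The function scale and its Google-style docstring are hypothetical examples, and inspect.getdoc from the standard library is used here only to display the docstring. It is not how the template builds api.md.

```python
import inspect


def scale(values: list[float], factor: float) -> list[float]:
    """Multiply every value by a constant factor.

    Args:
        values: Numbers to scale.
        factor: Multiplier applied to each number.

    Returns:
        A new list with each value multiplied by ``factor``.
    """
    return [value * factor for value in values]


# Show the cleaned-up docstring attached to the function object.
print(inspect.getdoc(scale))
```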

tests folder

tests
|- __init__.py
|- test_{{cookiecutter.pkg_name}}.py

The tests folder is the standard folder that holds test scripts for a Python package. A very simple test script template is provided in test_{{cookiecutter.pkg_name}}.py. These test scripts will later be run using tox (https://tox.wiki/en/latest/) and pytest (https://docs.pytest.org/), which make it easy to test under multiple system configurations in one go.
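A pytest-style test is simply a function whose name starts with test_ and whose body makes plain assert statements. The sketch below is a hypothetical example: slugify stands in for a function from your own package and is not part of the template.

```python
def slugify(title: str) -> str:
    """Hypothetical package function: lower-case a title and join its words with hyphens."""
    return "-".join(title.lower().split())


def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


def test_slugify_collapses_extra_whitespace():
    assert slugify("  Hello   World ") == "hello-world"
```

In a real project the function would live in the package folder and the test file would import it. pytest then collects and runs every test_ function it finds.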

{{cookiecutter.project_slug}} folder – project root

{{cookiecutter.project_slug}} (project root)
|- .bumpversion.cfg
|- .editorconfig
|- .gitignore
|- .pre-commit-config.yaml
|- CHANGELOG.md
|- LICENSE
|- makefile
|- mkdocs.yml
|- poetry.toml
|- pyproject.toml
|- README.md
|- setup.cfg

There are many files at the project root level.

.bumpversion.cfg is the configuration file for bump2version (https://github.com/c4urself/bump2version), which helps update all version strings in the source code with a single command.
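If "bumping" a version is an unfamiliar idea, the sketch below illustrates it for a plain major.minor.patch string. This is not bump2version's implementation, and the function bump is hypothetical. The real tool takes its behaviour from .bumpversion.cfg and also rewrites the version string in your files.

```python
def bump(version: str, part: str) -> str:
    """Return `version` with the chosen part incremented.

    Assumes a plain "major.minor.patch" string; lower-order parts reset to 0.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.1.0", "patch"))
print(bump("0.1.9", "minor"))
```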

.editorconfig (https://editorconfig.org/) is a file that defines coding styles and text editor configurations for multiple IDEs.

.gitignore should be a well-known file for anyone who has worked with Git before. It contains a list of patterns matching files and folders that should be excluded from Git tracking.

.pre-commit-config.yaml is the configuration file for pre-commit (https://pre-commit.com/). pre-commit introduces commands (usually linters and auto-formatters) that run automatically right before git commit, and it will stop the commit if any check fails.

CHANGELOG.md should be used by the package author to record the features added and bugs fixed in each version. It should follow the changelog format defined at https://keepachangelog.com/en/1.0.0/.

LICENSE is another common file that specifies the copyright license associated with the Python package. cookiecutter-pypackage helps generate this file based on user-specified license type during project setup.

makefile contains many commonly used commands with recommended default parameters to be used with the make command. For example, make clean will clean up temporary files and caches generated during the development process.

mkdocs.yml is the configuration file for mkdocs (https://www.mkdocs.org/), which helps generate documentation using the templates specified in the docs folder.

poetry (https://python-poetry.org/) is an amazing Python packaging and dependency manager that I highly recommend. poetry.toml is the configuration file for general poetry behaviour. Currently poetry.toml contains instructions that make poetry store virtual environments under the project folder, rather than centrally.

pyproject.toml contains general Python project-specific build configurations as specified in PEP 518 (https://www.python.org/dev/peps/pep-0518/). It contains meta information on the Python package that poetry uses for the PyPI release. It also lists all dependencies with their version constraints, which poetry uses to install packages and manage virtual environments. Note, however, that pyproject.toml can also contain configurations for other tools such as isort.

README.md, as its name indicates, is the file that contains overview information introducing users to the Python package. It is usually the first document a new user sees, so it is helpful to keep important information there to help the new user get started with the Python package (e.g. how to install it, how to run the tool).

setup.cfg is another configuration file that contains general Python project-specific build configurations. setup.cfg complements pyproject.toml by helping to specify configurations for other tools such as flake8 and tox.
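setup.cfg uses an INI-style layout in which each tool reads its own section. The sketch below parses a small fragment with Python's standard configparser to make that layout visible. The option values are invented for illustration and are not the template's actual settings.

```python
import configparser

# Hypothetical fragment: one section for flake8 and one for tox.
SETUP_CFG_TEXT = """
[flake8]
max-line-length = 120
exclude = .git,__pycache__,build,dist

[tox:tox]
envlist = py38, py39, py310
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG_TEXT)

# Each tool looks only at its own section.
max_line_length = parser.getint("flake8", "max-line-length")
envlist = [env.strip() for env in parser.get("tox:tox", "envlist").split(",")]

print(parser.sections())
print(max_line_length, envlist)
```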

Hopefully this long post helps ease your transition into working with a Python-based cookiecutter template!
