Development Guidelines

How we build: quality by default, self-steering execution, an unambiguous definition of done, and security treated as a property of every phase — not a gate at the end. These guidelines are stack-agnostic and assume a GitHub-based workflow.

The goal is autonomy with accountability: criteria clear enough that any engineer can pick up work, drive it to done, and verify it themselves — without constant clarification, and without anyone having to trust that “it works.”


0. Core Principles

  • Make work verifiable. A task without a checkable success condition isn’t ready to start. Turn intent into criteria before writing code.
  • Ship small. Smaller changes are easier to review, test, revert, and reason about. Breaking changes are split out, never bundled.
  • Quality is built in, not inspected in. Tests, types, and review are part of the work, not a phase after it.
  • Security is everyone’s job, in every phase. Threats are considered at design time, defended at implementation, checked at review, and enforced in CI.
  • The main branch is always releasable. If main is red or broken, fixing it is the team’s top priority.

1. Self-Steering Work

Define success, then loop until verified. Don’t wait to be told you’re done.

Transform every task into a goal with a checkable condition:

  • “Add validation” → “Tests cover the invalid inputs; they fail before the change and pass after.”
  • “Fix the bug” → “A test reproduces the bug, fails on main, passes on the branch.”
  • “Refactor X” → “The existing test suite passes before and after; behaviour is unchanged.”

Strong criteria let you work independently. Weak criteria (“make it work”) force constant check-ins. When you catch yourself unsure whether something is done, that’s a sign the criteria were never defined — stop and define them.
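To illustrate the "Fix the bug" criterion, here is a minimal Python sketch. The function `parse_port` and its off-by-one bug are hypothetical, invented for this example. The point is only that the test encodes the success condition: it fails against the buggy comparison and passes against the fix.

```python
def parse_port(text: str) -> int:
    """Parse a TCP port from text; valid ports are 1..65535 (hypothetical helper)."""
    port = int(text)
    # The imagined buggy version on main compared against 65536 here,
    # which let the out-of-range value 65536 through.
    if port < 1 or port > 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def test_rejects_port_just_above_range():
    # Reproduces the bug: fails on main, passes on the branch.
    try:
        parse_port("65536")
    except ValueError:
        return
    raise AssertionError("65536 was accepted")


def test_accepts_highest_valid_port():
    # Guards the contract so the fix does not overcorrect.
    assert parse_port("65535") == 65535
```

The tests use plain `assert` statements, so any runner that collects `test_*` functions can execute them, as can a direct call.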

Plan in steps, finish one before the next

For anything larger than a single function, write a short plan and execute it one step at a time. Each step finishes — code and its tests — before the next begins.

1. [Step] → verify: [check]
2. [Step] → verify: [check]
3. [Step] → verify: [check]

Split by change size. Use semver as a yardstick: anything above a minor change — i.e. a breaking / major change — is its own step (and usually its own PR). Never sneak a breaking change into a larger pass.

The step breakdown is also your schedule check: with the task’s target time (set at assignment) divided across the steps, your progress through them tells you whether you’re on track. If the early steps are already eating the budget, escalate early — a heads-up at 30% is useful, a surprise at the deadline isn’t.
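The schedule check above is simple arithmetic. A rough Python sketch follows; the per-step budgets, the function names, and the 1.25 tolerance are all assumptions made for illustration, not values these guidelines prescribe.

```python
def overrun_ratio(step_budgets_h: list[float], steps_done: int, hours_spent: float) -> float:
    """Hours actually spent divided by the hours budgeted for the finished steps."""
    planned = sum(step_budgets_h[:steps_done])
    if planned <= 0:
        raise ValueError("finish at least one budgeted step before checking")
    return hours_spent / planned


def should_escalate(step_budgets_h: list[float], steps_done: int,
                    hours_spent: float, tolerance: float = 1.25) -> bool:
    """True when the finished steps cost noticeably more than planned.

    The default tolerance of 1.25 is a hypothetical threshold.
    """
    return overrun_ratio(step_budgets_h, steps_done, hours_spent) > tolerance
```

For a 16-hour task split into steps of 2, 4, 6, and 4 hours, finishing the first two steps in 9 hours gives a ratio of 1.5, which is the moment to raise a heads-up rather than wait for the deadline.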

Know when to proceed and when to stop

  • Proceed when the criteria are clear and the path is within the spirit of the request.
  • Stop and ask when there are multiple reasonable interpretations, when a simpler approach exists, or when the request would require a breaking change that wasn’t anticipated. Name what’s unclear; don’t pick silently.

Work in the open

Always make your work visible. Others should be able to see where a piece of work is heading without having to ask.

  • Push at the end of each day. Work-in-progress on a branch is visible progress; work only on your laptop is invisible and at risk.
  • Document as you go — what you researched, what you ruled out, why the current direction. Link the resources you leaned on.
  • This is what makes handover painless: when the whole process is on the issue and the branch, someone else can pick it up from where you left off.

2. Definition of Done

“Done” is a checklist, not an opinion. A change is done when it can be released without further explanation.

Per-change (every PR)

  • Solves exactly what was asked — no speculative features, no unrequested refactors.
  • Tests added/updated; they fail without the change and pass with it.
  • Full CI suite is green (tests, linters, type checks, security scans).
  • Public behaviour, interfaces, and non-obvious decisions are documented.
  • Inputs from trust boundaries are validated; failure modes are explicit.
  • No secrets, credentials, or sensitive data in code, history, or logs.
  • Reviewed and approved per branch protection rules.
  • The change is reversible (clean revert, or a documented rollback path).

Per-feature (epic / milestone)

  • Acceptance criteria from the issue are all met and demonstrable.
  • User-facing docs / changelog updated.
  • Observability in place (logs, metrics, or traces for the new path).
  • Security review completed for anything touching auth, data handling, or external input.
  • Rollout and rollback plans agreed for anything risky.

If an item doesn’t apply, say so in the PR — don’t silently skip it.


3. Quality

Readable, tested, typed, documented — by default, not on request.

Code

  • Simplicity first. The minimum code that solves the problem. No abstractions for single-use code, no configurability nobody asked for, no error handling for impossible internal states. If 200 lines could be 50, rewrite it.
  • Surgical changes. Touch only what the task requires. Match existing style. Don’t “improve” adjacent code. Remove only the orphans your change created; flag pre-existing dead code rather than deleting it.
  • Stay in scope. Noticing a minor annoyance mid-task is not license to fix it. File it as its own issue and keep the current change focused — scope creep is how a one-line fix becomes an unreviewable PR.
  • Types where the language supports them. Annotations, generics, typed signatures — use them. They are documentation that the compiler checks.
  • Modern and idiomatic, favouring stable current features over legacy patterns — but not novelty for its own sake.
  • Every language has an enforced ruleset. A formatter and a linter are configured per language, committed to the repo, and run in CI (§5) — style is automated, never argued in review. Introducing a language with no existing ruleset means bringing at least a minimal one (a standard formatter and the language’s default/recommended lint set) in the same change.

Documentation

  • Every function carries at least a short comment stating its purpose, inputs/outputs, and any non-obvious behaviour. Good names complement these comments; they don’t replace them.
  • For non-trivial logic, comment the why, not the what.
  • For third-party integrations, link the specific API endpoint/method in a comment — not just the product homepage.

Tests

  • Tests are first-class deliverables and live with the code they cover.
  • Cover the contract (expected behaviour) and the edges (invalid input, boundaries, failure paths) — not just the happy path.
  • A flaky test is a broken test. Fix it or quarantine it with a tracking issue; never let it train the team to ignore red.
  • Coverage is a signal, not a target. Don’t write tests to move a number.

Review

  • Review for correctness, clarity, security, and scope — in that order.
  • Small PRs get fast, careful reviews. Large PRs get rubber stamps. Keep them small.
  • Reviews are about the code, not the author. Be direct and kind.

Tooling & environment

  • Work in devcontainers where possible. A reproducible, containerized dev environment keeps everyone on the same toolchain and, crucially, keeps dependency and project code from executing directly on your host. Minimize what you run directly on your host machine: the less untrusted code touches it, the smaller the blast radius if something is malicious.
  • AI assistants are allowed, but you own the output. You may use AI to help produce code, tests, or docs, but you are fully responsible for the accuracy, security, and quality of the result — review it as critically as anything you wrote yourself. “The AI generated it” is never an explanation for a defect.
  • Company-vetted solutions only. Tools, services, and AI assistants must be ones the company has vetted and approved. Don’t route code or data through unapproved tooling.

4. Security in All Phases

Shift left: the cheapest place to fix a vulnerability is before it’s written. Security is not a final gate — it shows up in design, code, review, CI, and operations.

Design

  • Identify trust boundaries: where does untrusted input enter (users, external APIs, files, queues)? Defences belong at those boundaries.
  • Threat-model anything touching authentication, authorization, secrets, payments, or personal data. Ask “what could an attacker do here?” before writing code.
  • Default to least privilege and deny-by-default for access and permissions.
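Deny-by-default from the list above can be sketched in a few lines of Python. The roles and actions are hypothetical; what matters is the shape, where access exists only if it appears in an explicit allow-list and every lookup miss resolves to "no".

```python
from enum import Enum


class Action(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"


# Explicit allow-list (hypothetical roles). Anything not listed is denied.
GRANTS: dict[str, frozenset] = {
    "viewer": frozenset({Action.READ}),
    "editor": frozenset({Action.READ, Action.WRITE}),
}


def is_allowed(role: str, action: Action) -> bool:
    """Deny by default: unknown roles and unlisted actions get no access."""
    return action in GRANTS.get(role, frozenset())
```

Adding a new action or role changes nothing until someone grants it, so forgetting a rule errs on the side of denial.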

Implementation

  • Validate at trust boundaries. Reject invalid external input clearly rather than continuing in a bad state. (This is about real external input — not a license to handle impossible internal states.)
  • Never construct queries, commands, or markup by string concatenation of untrusted input. Use parameterization, safe APIs, and contextual encoding.
  • No secrets in code, config, or commit history. Use a secrets manager or injected environment variables.
  • Pin every external dependency by cryptographic hash, not by version number — a version string is mutable and can be re-pointed at malicious content, a hash cannot. Commit a lockfile that records integrity hashes for the full transitive tree, and have the toolchain verify them on install. This applies to all external artifacts: language packages, container base images, and CI actions/images alike.
  • Fail closed: on error, deny access and reveal nothing sensitive in messages or logs.
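Two of the points above, validating at the trust boundary and never building queries by concatenation, fit in one short Python sketch using the standard-library `sqlite3` module. The `users` table, the username rule, and the function names are assumptions made for the example.

```python
import re
import sqlite3
from typing import Optional

# Hypothetical rule for what a username may look like.
USERNAME_RE = re.compile(r"[a-z0-9_]{3,32}")


def validate_username(raw: str) -> str:
    """Reject invalid external input at the boundary instead of continuing."""
    if not USERNAME_RE.fullmatch(raw):
        raise ValueError("invalid username")
    return raw


def find_user_id(conn: sqlite3.Connection, raw_username: str) -> Optional[int]:
    """Look up a user id; the value is passed as a parameter, never spliced into SQL."""
    username = validate_username(raw_username)
    row = conn.execute("SELECT id FROM users WHERE name = ?", (username,)).fetchone()
    return row[0] if row else None


# In-memory database with stand-in data so the sketch runs as written.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
```

The two defences are independent on purpose: validation gives a clear rejection at the edge, and the `?` placeholder keeps the query safe even if a validation rule is later loosened.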

Review

  • Treat security as a first-class review concern, not an afterthought.
  • Require explicit review for changes to auth, crypto, data handling, or any trust boundary — ideally via CODEOWNERS (see §5).
  • Adding a dependency is a deliberate decision made with extreme caution: prefer not to, prefer the standard library, prefer code you can read. When you do add one, weigh maintainer, activity, transitive footprint, and license — and ask a colleague to help review its quality. Don’t approve your own dependency additions in isolation.

CI / CD (automated, enforced)

  • SAST — static analysis on every PR.
  • Dependency / SCA scanning — flag known-vulnerable dependencies (e.g. Dependabot alerts) and fail on critical findings.
  • Secret scanning — block pushes that contain credentials (e.g. GitHub push protection).
  • Pin and verify — pin CI actions/images to a commit digest (not a tag); verify dependency lockfile integrity on install and fail on mismatch; scope CI tokens to least privilege.
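What "verify integrity and fail on mismatch" means can be shown with the standard-library `hashlib`. This is a sketch of the idea only: the lock format (a plain name-to-digest mapping) and the function names are assumptions, and in practice the package manager's own verification mode should do this work rather than hand-written code.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(name: str, data: bytes, lock: dict[str, str]) -> None:
    """Raise unless `data` matches the digest recorded for `name` in the lock."""
    expected = lock.get(name)
    if expected is None:
        # Fail closed: an artifact that was never pinned is not installed.
        raise LookupError(f"{name}: not in lockfile, refusing to install")
    if not hmac.compare_digest(sha256_hex(data), expected):
        raise ValueError(f"{name}: integrity mismatch")


# Stand-in artifact and lock entry for the example.
artifact = b"example package contents"
lock = {"example-pkg-1.0.0": sha256_hex(artifact)}
verify_artifact("example-pkg-1.0.0", artifact, lock)  # passes silently
```

A version label can be pointed at new bytes; the digest cannot, which is why the check compares content rather than names.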

Operations

  • Log security-relevant events (authn/authz decisions, failures) — without logging secrets or sensitive data.
  • Have a patching path: a known-vulnerable dependency should be upgradeable and releasable quickly.
  • Know the rollback story before you ship.
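Logging security events without leaking secrets usually comes down to redacting before the message is built. A minimal Python sketch with the standard `logging` module follows; the set of sensitive key names, the logger name, and the message layout are assumptions for illustration.

```python
import logging

# Hypothetical list of field names whose values must never reach the log.
SENSITIVE_KEYS = {"password", "token", "secret", "authorization"}

security_log = logging.getLogger("security")


def redact(fields: dict) -> dict:
    """Return a copy of `fields` with sensitive values masked."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in fields.items()
    }


def log_auth_decision(user: str, allowed: bool, **context) -> str:
    """Log an authn/authz decision and return the message that was logged."""
    outcome = "allow" if allowed else "deny"
    message = f"auth decision={outcome} user={user} context={redact(context)}"
    # Denials are logged at WARNING so they stand out; allows at INFO.
    security_log.log(logging.INFO if allowed else logging.WARNING, message)
    return message
```

Calling `log_auth_decision("alice", False, password="hunter2", ip="203.0.113.7")` records the denial and the source address while the password value is masked.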

5. GitHub-Based Workflow

The repository is the source of truth, and the process is encoded in it — in branch protection, CODEOWNERS, templates, and CI — so the right thing is the default thing.

Branching

  • Trunk-based: short-lived branches off main, merged back quickly.
  • Two prefixes only: feature/... for features, bugfix/... for fixes.
  • Keep branches short-lived. Long-lived branches drift and rot; rebase or merge main in frequently.

Commits

  • Small, atomic commits with messages that explain why, not just what.
  • The message describes what the commit actually does — message and diff match. A commit that claims one thing and does another is a defect in its own right.
  • Follow Chris Beams’ “How to Write a Git Commit Message” for form: imperative subject ≤50 chars, capitalized, no trailing period; blank line; body wrapped at ~72 explaining what and why. See https://chris.beams.io/posts/git-commit/.
  • Every commit declares its semver impact with an Impact: trailer: Impact: major, Impact: minor, or Impact: patch. Use major for any breaking change.
  • Every commit references the issue it serves, using Related: (contributes to) or Closes: (completes it). A commit with no issue reference doesn’t merge.
  • Commits are cryptographically signed (GPG or SSH) and show as verified by the host. Unsigned commits don’t merge — enforced by branch protection.
  • An optional Severity: trailer (critical | high | medium | low) signals how urgently the change needs to reach production. Severity and impact are distinct: impact is the size of the change (semver), severity is the urgency of deploying it.

Example commit message:

Tighten upload size validation

Reject payloads over the configured limit at the boundary instead of
buffering them, closing a memory-exhaustion vector.

Impact: minor
Severity: high
Closes: PROJ-1234
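The commit rules above are regular enough to lint. The following Python sketch checks the subject form and the trailers; the function name and the wording of the violation messages are made up for this example, and it does not check body wrapping or signatures.

```python
import re

IMPACTS = {"major", "minor", "patch"}
SEVERITIES = {"critical", "high", "medium", "low"}
TRAILER_RE = re.compile(r"^(Impact|Severity|Related|Closes): (\S.*)$")


def check_commit_message(message: str) -> list[str]:
    """Return the list of rule violations; an empty list means the message passes."""
    problems = []
    lines = message.strip("\n").split("\n")
    subject = lines[0]
    if len(subject) > 50:
        problems.append("subject longer than 50 characters")
    if not subject[:1].isupper():
        problems.append("subject not capitalized")
    if subject.endswith("."):
        problems.append("subject ends with a period")
    if len(lines) > 1 and lines[1] != "":
        problems.append("no blank line after subject")

    # Collect recognised trailers from the lines after the subject.
    trailers = {}
    for line in lines[2:]:
        match = TRAILER_RE.match(line)
        if match:
            trailers[match.group(1)] = match.group(2).strip()

    if trailers.get("Impact") not in IMPACTS:
        problems.append("missing or invalid Impact: trailer")
    if "Severity" in trailers and trailers["Severity"] not in SEVERITIES:
        problems.append("invalid Severity: trailer")
    if "Related" not in trailers and "Closes" not in trailers:
        problems.append("no issue reference (Related: or Closes:)")
    return problems
```

Run against the example message above, the function returns an empty list; a message such as `fix stuff.` is reported for capitalization, the trailing period, the missing Impact: trailer, and the missing issue reference.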

Pull Requests

  • One logical change per PR. If you can’t summarize it in a sentence, split it.
  • Open as a draft while in progress; mark ready when the per-change Definition of Done (§2) is met.
  • Use a PR template that prompts for: what changed, why, how it was verified, and security/rollback considerations.
  • Link the issue the PR resolves. PRs without context are hard to review and harder to audit later.
  • The implementor sets reviewers and testers up to succeed: the PR spells out how to review and how to test — what to run, the expected behaviour, edge cases worth checking, and anything non-obvious about the approach. A reviewer should never have to guess how to verify the work.
  • Coordinate the merge itself with stakeholders and any components that depend on the change — a merge that breaks or surprises a downstream consumer isn’t done well, even if the code is correct. Time it together where there’s coupling.
  • A PR that touches a public interface (an API, schema, event, or any contract others depend on) clears a higher bar: more review scrutiny and explicit confirmation, plus a coordinated roll-out with the consumers of that interface. Breaking a published contract silently is never acceptable.

Reviews & testing

  • CODEOWNERS routes reviews automatically and mandates the right reviewers for sensitive areas (auth, infra, data, security-critical paths).
  • At least one approving review before merge; more for high-risk areas.
  • Authors don’t merge their own unreviewed code.

What a review covers. Style is automated (§3); a human review is for the things a linter can’t see:

  • Logic and correctness — does the implementation actually do the right thing? This is the heart of the review, not an afterthought to formatting.
  • Faithfulness to the issue — the implementation matches the request. Any deviation in scope or approach is legitimate only if a documented discussion shows it was agreed, and the issue itself was updated to reflect it. Undocumented extras or silent scope drift don’t pass review.
  • Tests — are they valid, and extensive enough to cover the contract and the edges?
  • Fixing a minor typo inline is fine; the focus is always on whether the solution is valid, not on cosmetics.

Review is a dialogue. It’s a conversation between reviewer and implementor, not a one-way gate:

  • The reviewer should come away understanding what the change does, why it’s needed, and why this design was chosen — if they can’t, the change isn’t ready or isn’t explained well enough.
  • Implementors explain their reasoning; reviewers ask until they understand.
  • A reviewer with a different idea can challenge the original approach and propose alternatives — that’s the point of review, not friction to avoid.

Recording what was done. Reviewers and testers state what they did and what they did not do — an approval is a record of verification, not a claim of faith:

  • How the implementation was reviewed (what was read, what was reasoned through, what was checked against the issue).
  • How it was tested, with full concrete examples: the actual commands run and their shell output, the inputs given and the outputs observed. This is the evidence of what was genuinely exercised — paste it, don’t summarize it.
  • Anything deliberately not covered, so the gap is visible rather than assumed closed.

Review turnaround. As a rule of thumb, how fast a ready PR should get a review tracks the issue’s priority (§ Intake, priority & assignment). These are expectations, not hard SLAs — use judgement, but treat a slip as something to flag, not absorb silently:

Priority | Review by
P0       | Immediately, as soon as it’s ready
P1       | End of the same day
P2       | Next business day
P3       | Within ~48 hours
P4       | Within ~5 working days

Branch protection (on main)

  • Require pull requests — no direct pushes.
  • Require approving review(s) and resolution of all conversations.
  • Require status checks to pass: tests, linters, type checks, and security scans.
  • Require branches to be up to date before merge.
  • Require signed commits.
  • Enable secret-scanning push protection.
  • Restrict who can push/merge; no force-pushes to main.

CI/CD with GitHub Actions

CI is the enforcement point for the Definition of Done. Every gate that matters is automated; a green check means “releasable,” not “probably fine.” A stack-agnostic shape:

# .github/workflows/ci.yml — gates are named; fill the run steps per stack.
name: CI
on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read          # least privilege; grant more only per-job as needed

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-sha>
      - name: Lint            # style / static checks
        run: <lint command>
      - name: Type-check      # if the language supports it
        run: <typecheck command>
      - name: Test            # unit + integration; must fail on regressions
        run: <test command>
      - name: SAST            # static security analysis
        run: <sast command>
      - name: Dependency scan # known-vulnerable dependencies
        run: <sca command>

Principles for the pipeline regardless of stack:

  • Fail fast and loudly. A failing gate blocks merge; red is never normal.
  • Least privilege. Default permissions: read; widen per job only when required. Scope deploy credentials tightly.
  • Pin actions to a full commit SHA, not a tag or release name. Third-party Actions run with access to your pipeline, and a tag can be moved to point at different code.
  • Same checks locally. Provide the equivalent commands (pre-commit hooks, a make/task target) so engineers verify before pushing and the loop is fast.
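A local equivalent of the pipeline can be as small as a script that runs the same named gates in order and stops at the first failure. The Python sketch below assumes nothing about the stack: the gate commands are placeholders that invoke the Python interpreter so the example runs anywhere, and a real project would substitute its own lint, type-check, and test commands.

```python
import subprocess
import sys
from typing import Optional


def run_gates(gates: list[tuple[str, list[str]]]) -> Optional[str]:
    """Run each (name, argv) gate in order.

    Returns the name of the first failing gate, or None if all pass.
    Later gates are not run after a failure (fail fast).
    """
    for name, argv in gates:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return name
    return None


# Placeholder commands for illustration only.
EXAMPLE_GATES = [
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
    ("test", [sys.executable, "-c", "import sys; sys.exit(1)"]),
    ("sast", [sys.executable, "-c", "print('never reached')"]),
]
```

Here `run_gates(EXAMPLE_GATES)` returns `"test"`, because the second placeholder exits non-zero and the third gate is skipped.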

Releases

  • Versions follow Semantic Versioning (MAJOR.MINOR.PATCH). The bump is the highest Impact: trailer among the commits since the last release (major > minor > patch).
  • Releases are cut as signed, annotated tags from main — tag signatures are required, so a release’s provenance is verifiable.
  • Maintain a changelog (generated or curated) so consumers know what changed and whether anything is breaking.
  • Releases are planned and coordinated, because communicating features to users matters as much as shipping them. The goal is a regular weekly release cadence with release controls in place — we may not be there yet, but that’s the direction.
  • Target cadence, as a rule of thumb (not strict): at most one major per month, about one minor per week, and patches as needed. This gives a rough pipeline — plan in week N, implement in N+1, review in N+2, release in N+3 — so work flows at a predictable, communicable rhythm.
  • Before implementing a feature, have an idea of which version it ships in. Knowing the target release lets the feature be planned, sequenced, and communicated ahead of time rather than announced after the fact.
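The version-bump rule from the first bullet ("the highest Impact: trailer among the commits since the last release") reduces to a small function. A Python sketch, assuming the trailers have already been extracted into a list of strings; the function name is hypothetical.

```python
# Ordering used to pick the highest impact: major > minor > patch.
RANK = {"patch": 0, "minor": 1, "major": 2}


def next_version(current: str, impacts: list[str]) -> str:
    """Bump `current` (MAJOR.MINOR.PATCH) by the highest impact in `impacts`."""
    if not impacts:
        raise ValueError("no commits since the last release")
    major, minor, patch = (int(part) for part in current.split("."))
    highest = max(impacts, key=RANK.__getitem__)  # KeyError on an unknown impact
    if highest == "major":
        return f"{major + 1}.0.0"
    if highest == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, commits tagged patch, minor, patch on top of 1.4.2 produce 1.5.0, and a single major among them would produce 2.0.0.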

Versioning & support

  • Public interfaces are versioned. Consumers depend on contracts, so a contract carries a version and changes follow SemVer (a breaking change is a new major).
  • When building a newer interface, re-implement the old interface on top of the new code behind the scenes — keep the old contract identical to callers while it runs through the new implementation. This lets old internal code be retired without dropping the interfaces people still rely on.
  • Support policy (to align with our obligations under the EU Cyber Resilience Act, CRA):
    • Only the latest major version is supported, apart from a migration overlap: when a new major ships, the previous major stays supported for 4 more weeks so consumers have time to migrate.
    • A minor version is supported until the next minor is released.
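Re-implementing the old interface on top of the new code, as described above, is essentially an adapter. The Python sketch below uses an invented pricing example: `price_v2`, `Money`, and `get_price` are hypothetical names, and the catalogue is stand-in data. The old function keeps its exact signature and return type while delegating to the new implementation.

```python
from dataclasses import dataclass


@dataclass
class Money:
    """v2 representation: integer minor units plus a currency code."""
    amount_minor: int
    currency: str


def price_v2(sku: str) -> Money:
    """New implementation (v2); the catalogue is stand-in data."""
    catalogue = {"widget": Money(1999, "EUR")}
    return catalogue[sku]


def get_price(sku: str) -> float:
    """Old v1 contract, unchanged for callers: a float price in EUR.

    Internally this is now a thin adapter over price_v2, so the old
    v1 implementation can be deleted without breaking v1 callers.
    """
    money = price_v2(sku)
    return money.amount_minor / 100
```

Callers of `get_price("widget")` still receive `19.99`, and only one implementation remains to maintain.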

Issues & work intake

Work is tracked in an issue tracker that is the source of truth for what and why, with the code host owning how. The tracker is treated as system-agnostic here — currently Jira, with some legacy items still in GitHub Issues; the guidance below holds regardless of tool.

Every unit of work is an issue before it is a branch. Commits and PRs reference it (§ Commits). No tracked issue, no merge.

Writing the request. Keep it short but complete enough that someone else could pick it up cold:

  • Feature request — the problem and who has it, the desired outcome (not a prescribed solution), scope boundaries (what’s explicitly out), and any security/data-handling implications.
  • Bug report — observed vs. expected behaviour, exact reproduction steps, environment/version, and impact/severity. Include an estimate of how often it happens: something reproducing every time is high or very high probability; “I saw it once” is low. The frequency estimate, alongside impact, drives the priority.

Triaging incoming reports. Bug reports are reviewed promptly, paced by priority: P0 and P1 immediately, the rest by the next business day. A report nobody triages can’t be prioritized.

Acceptance criteria turn the request into the per-feature Definition of Done (§2). Make them:

  • Testable — each criterion is something a reviewer can check pass/fail, not a vibe. Prefer the form “Given … when … then …”.
  • Observable from outside — phrased as behaviour or outcomes, not implementation steps.
  • Complete on the edges — include the failure and invalid-input paths, not just the happy path.
  • Bounded — if a criterion can’t be met in one reasonably-sized change, the issue is too big; split it.
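A criterion in "Given … when … then …" form maps almost directly onto a test. The Python sketch below uses an upload-limit criterion as the example; the handler `accept_upload`, the exception type, and the 1024-byte limit are hypothetical.

```python
class PayloadTooLarge(Exception):
    """Raised when an upload exceeds the configured limit."""


def accept_upload(payload: bytes, limit: int) -> int:
    """Hypothetical handler: returns the stored size or rejects oversized payloads."""
    if len(payload) > limit:
        raise PayloadTooLarge(f"payload exceeds {limit} bytes")
    return len(payload)


def test_rejects_payload_over_limit():
    # Given a configured limit of 1024 bytes
    limit = 1024
    # When a payload one byte over the limit is uploaded
    try:
        accept_upload(b"x" * (limit + 1), limit)
        rejected = False
    except PayloadTooLarge:
        rejected = True
    # Then the upload is rejected
    assert rejected


def test_accepts_payload_at_limit():
    # Given a configured limit of 1024 bytes
    limit = 1024
    # When a payload exactly at the limit is uploaded
    stored = accept_upload(b"x" * limit, limit)
    # Then it is accepted and its full size is stored
    assert stored == limit
```

Both tests describe behaviour observable from outside, and together they cover the edge on each side of the boundary rather than only the happy path.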

Keeping stakeholders in sync:

  • The issue is the single record. Decisions, scope changes, and trade-offs are written back to it — not left in chat or DMs.
  • Name the stakeholders (requester, reviewer/owner, affected teams) on the issue so notifications reach the right people.
  • Material scope or acceptance-criteria changes are agreed with the requester and owner before implementation continues, and recorded on the issue.
  • Status lives on the issue and is kept current, so anyone can see where things stand without asking.

Audience & collaboration:

  • Know who the change is for before you build it. Identify the target audience (end users, another team, an internal service) and involve them — their input shapes whether the result is actually right, not just whether it works.
  • Ask for help. Pulling in expertise from other teams when you’re outside your depth is expected, not a weakness; it’s cheaper than guessing wrong.
  • Staff projects for breadth. A project should always include people whose combined knowledge spans the whole field it touches, so no major area is left to chance.
  • Document dependencies on other projects — as explicit dependency documents, or otherwise recorded somewhere discoverable — so cross-project coupling is visible and can be coordinated rather than discovered late.

Projects:

  • Projects have a clear structure and start from a business driver — reduce manual work, fix a class of issues, deliver a customer-facing feature, and so on. The “why this exists” is set by the business.
  • Planning is top-down in granularity: the business and leads set high-level goals and shape; the detailed “how exactly to implement” is decided within the project team, by the people doing the work. Don’t over-specify implementation from the top, and don’t leave the goal vague at the bottom.

Intake, priority & assignment:

  • All new work requests enter through a team lead or project manager — the single intake point that triages, prioritizes, and assigns. No feature reaches a branch without lead/PM review and acceptance; features are never self-started.
  • Every issue carries a priority from P0 (highest) to P4 (lowest), set at intake. Priority drives what gets scheduled and what may interrupt a sprint.
  • Bugfixes already triaged in the backlog are generally free to pick up — if it’s broken and you can fix it, take it (via its issue and the normal review path).
  • Features are assigned by the lead/PM so priorities and capacity stay coordinated; don’t self-assign net-new feature work.
  • Handover review. When a task is handed out, the assignee and the person assigning it review it together before work starts, to confirm scope, acceptance criteria, and what’s out of scope are mutually understood. Align here, not three days into the branch. Set a target time for the work, so there’s a shared sense of when it’s running long and should be re-examined.
  • Impact assessment. As part of that review, assess and record on the issue what the change touches: affected audiences/users, other components and services, interfaces and contracts (APIs, schemas, events), and any security, data, or migration implications. This sets the commit’s Impact: level and surfaces cross-team coordination early.

Sprints:

  • A sprint’s scope is fixed once it starts. Only the tech lead may inject a new task mid-sprint, and only to mitigate an emergency.
  • Only P0 or P1 issues may be added to a sprint in progress; P2–P4 work waits for the next planning cycle.

Time tracking:

  • Time spent implementing features and bugfixes is tracked, tagged with the issue identifier where applicable. This serves accounting requirements and sharpens planning by making visible where effort actually goes.

Quick reference

Phase     | Quality gate                     | Security gate
Design    | Verifiable acceptance criteria   | Threat model trust boundaries
Implement | Types, tests, docs, simplicity   | Validate input, no secrets, safe APIs
Review    | Correctness, clarity, scope      | CODEOWNERS review of sensitive paths
CI        | Tests + lint + type-check green  | SAST + SCA + secret scanning pass
Merge     | Branch protection satisfied      | Up-to-date, signed-off, reversible
Operate   | Observability in place           | Logging, patch path, rollback ready

The test for any change: every line traces to the request, every claim of “done” is backed by a check someone else can run, and nothing crossed a trust boundary without being validated.