Monthly notes 50

Issue 50, 15.6.2020

Serverless

AWS Lambda — should you have few monolithic functions or many single-purposed functions?
Interesting question of whether the single responsibility principle (SRP) should be followed in the serverless world. What is a “function” if not SRP? TL;DR; many single-purposed functions are better.

Stories

Twitter search of "telling early-in-career engineers stories of times you messed something up real bad is a good way to help them combat their own impostor syndrome." (from @ElleArmageddon)

Kubernetes

In Kubernetes, what should I use as CPU requests and limits?
Good Twitter thread on the differences between requests and limits.

How should I answer a health check?
Explains how to use liveness and readiness probes (on Kubernetes). I've heard that the liveness probe should be off unless there's a bug in the app from which it can't recover on its own. Also, long-running checks can be cached.

Managed Kubernetes Price Comparison (2020)
"TL;DR: Azure and Digital Ocean don’t charge for the compute resources used for the control plane, making AKS and DO the cheapest for running many, smaller clusters. For running fewer, larger clusters GKE is the most affordable option. Also, running on spot/preemptible/low-priority nodes or long-term committed nodes makes a massive impact across all of the platforms."

Learning

Performance profiling for Web Applications with Sam Saccone
"How to use Chrome DevTools to understand a Web application's performance bottlenecks. Goes over a few different workflows that will help us to answer the question "Why is this slow and how can I fix it"."

Tools

OpenSnitch
GNU/Linux port of the Little Snitch application firewall. (from Hacker Newsletter #490, comments)

Kubectl-debug
kubectl-debug is an out-of-tree solution for troubleshooting running pods. It allows you to run a new container in a running pod for debugging purposes (examples). The new container joins the pid, network, user and ipc namespaces of the target container, so you can use arbitrary troubleshooting tools without pre-installing them in your production container image.

Lighthouse audit add-on for Firefox
"Report, Performance, Accessibility, PWAs, SEO scores for any public site. Without opening DevTools."

Generating JWT and JWK for information exchange between services

Securely transmitting information between services and handling authorization can be achieved by using JSON Web Tokens. JWTs are an open, industry standard (RFC 7519) method for representing claims securely between two parties. Here's a short explanation of what they are and how they're used, and a guide to generating the needed keys.

"JSON Web Token (JWT) is an open standard (RFC 7519) that defines a compact and self-contained way for securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed. JWTs can be signed using a secret (with the HMAC algorithm) or a public/private key pair using RSA or ECDSA."

jwt.io

You should read the introduction to JWT to understand its role, and there's also a handy JWT Debugger for testing things. For more detailed info you can read the JWT Handbook.

In short, authorization and information exchange are some scenarios where JSON Web Tokens are useful. They essentially encode a set of identity claims into a payload, provide some header data about how the token is to be signed, then calculate a signature using one of several algorithms and append that signature to the header and claims. JWTs can also be encrypted to provide secrecy between parties. When a server receives a JWT, it can guarantee that the data it contains can be trusted because it's signed by the source.

Usually two algorithms are supported for signing JSON Web Tokens: RS256 and HS256. RS256 generates an asymmetric signature, which means a private key must be used to sign the JWT and a different public key must be used to verify the signature. HS256, in contrast, uses a single shared secret for both signing and verification.

JSON Web Key

JSON Web Key (JWK) provides a mechanism for distributing the public keys that can be used to verify JWTs. The specification is used to represent the cryptographic keys used for signing RS256 tokens. This specification defines two high level data structures: JSON Web Key (JWK) and JSON Web Key Set (JWKS):

  • JSON Web Key (JWK): A JSON object that represents a cryptographic key. The members of the object represent properties of the key, including its value.
  • JSON Web Key Set (JWKS): A JSON object that represents a set of JWKs. The JSON object MUST have a keys member, which is an array of JWKs. The JWKS is a set of keys containing the public keys that should be used to verify any JWT.

In short, the service signs JWT tokens with its private key (in this case stored in PKCS12 format) and the receiving service checks the signature with the public key, which is distributed in JWK format.
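
To make this concrete, here's a minimal Node.js sketch (not part of the original guide), assuming the npm packages jsonwebtoken and jwk-to-pem and a hypothetical jwks.json file holding the published key set:

// Sign with the PEM private key (generated in the next section) and verify with the JWK public key.
const fs = require('fs');
const jwt = require('jsonwebtoken');
const jwkToPem = require('jwk-to-pem');

// Signing service: sign the claims with the RSA private key
const privateKey = fs.readFileSync('private.pem');
const token = jwt.sign({ sub: 'user-123' }, privateKey, {
  algorithm: 'RS256',
  keyid: 'something', // should match the "kid" published in the JWK set
  expiresIn: '10m',
});

// Receiving service: find the matching JWK from the key set and verify the signature
const jwks = JSON.parse(fs.readFileSync('jwks.json', 'utf8'));
const jwk = jwks.keys.find((key) => key.kid === 'something');
const claims = jwt.verify(token, jwkToPem(jwk), { algorithms: ['RS256'] });
console.log(claims);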

Generating keys and certificate for JWT

In this example we are using JWTs for information exchange as they are a good way of securely transmitting information between parties. Because JWTs can be signed (for example, using public/private key pairs), you can be sure the senders are who they say they are. Additionally, as the signature is calculated using the header and the payload, you can also verify that the content hasn't been tampered with.

Generate the keys and certificate for JWT with OpenSSL; in this case self-signed is enough. First, generate the private key:

$ openssl genrsa -out private.pem 4096

Generate the public key from the private key created above; pem-jwk may need it, but otherwise it isn't needed:

$ openssl rsa -in private.pem -out public.pem -pubout

If you try to put the private and public keys into PKCS12 format without a certificate, you get an error:

openssl pkcs12 -export -inkey private.pem -in public.pem -out keys.p12
unable to load certificates

Generate a self-signed certificate with the aforementioned key, valid for 10 years. The certificate itself isn't used for anything, as the counterpart is a JWK with just the public key, no certificate.

$ openssl req -key private.pem -new -x509 -days 3650 -subj "/C=FI/ST=Helsinki/O=Rule of Tech/OU=Information unit/CN=ruleoftech.com" -out cert.pem

Convert the above private key and certificate to PKCS12 format

$ openssl pkcs12 -export -inkey private.pem -in cert.pem -out keys.pfx -name "my alias"

Check the keystore:

$ keytool -list -keystore keys.pfx
OR
$ keytool -v -list -keystore keys.pfx -storetype PKCS12 -storepass
Enter keystore password:  
Keystore type: PKCS12
Keystore provider: SUN
Your keystore contains 1 entry
1, Jan 18, 2019, PrivateKeyEntry,
Certificate fingerprint (SHA-256): 0D:61:30:12:CB:0E:71:C0:F1:A0:77:EB:62:2F:91:9B:55:08:FC:3B:A5:C8:B4:C7:B4:CD:08:E9:2C:FD:2D:8A

If you didn't set an alias for the key when creating the PKCS12 file, you can change it:

keytool -changealias -alias "original alias" -destalias "my awesome alias" -keystore keys.pfx -storetype PKCS12 -storepass "password"

Now we finally get to the part where we generate the JWK. The final result is a JSON file which contains the public key from the earlier created key pair in JWK format, so that the receiving service can accept the signed tokens.

The JWK is in the format of:

" 
{
"keys": [
….,
{
"kid": "something",
"kty": "RSA",
"use": "sig",
"n": "…base64 public key values …",
"e": "…base64 public key values …"
}
]
}
"

Convert the PEM to JWK format with e.g. pem-jwk or with pem_to_jwks.py. (The signing service itself keeps its key in PKCS12 format, as described above.) The public key's n and e values are extracted from the private key with the following commands; the jq part picks the public parts and excludes the private parts.

$ npm install -g pem-jwk
$ ssh-keygen -e -m pkcs8 -f private.pem | pem-jwk | jq '{kid: "something", kty: .kty , use: "sig", n: .n , e: .e }'

...

To check things, you can do the following.

Extract a private key and certificates from a PKCS12 file using OpenSSL:

$ openssl pkcs12 -in keys.p12 -out keys_out.txt

The private key, certificate, and any chain files will be parsed and dumped into the "keys_out.txt" file. The private key will still be encrypted.

To extract just the private key from p12 (key is still encrypted):

$ openssl pkcs12 -in keys.p12 -nocerts -out privatekey.pem

Decrypt the private key:

$ openssl rsa -in privatekey.pem -out privatekey_uenc.pem

Now if you convert the PEM to JWK you should get the same values as before.

More to read: JWTs? JWKs? ‘kid’s? ‘x5t’s? Oh my!

Notes from DEVOPS 2020 Online conference

DevOps 2020 Online was held on 21.4. and 22.4.2020; the first day covered Cloud & Transformation and the second day was a 5G DevOps Seminar. Here are some quick notes from the talks I found most interesting. The talk recordings are available on the conference site.

DevOps 2020

How to improve your DevOps capability in 2020

Marko Klemetti from Eficode presented three actions you can take to improve your DevOps capabilities. The talk looked at current DevOps trends against organizations at different maturity levels and gave ideas on how to improve tooling, culture and processes.

  1. Build the production pipeline around your business targets.
    • Automation builds bridges until you have self-organized teams.
    • Adopt a DevOps platform. Aim for self-service.
  2. Invest in a Design System and testing in natural language:
    • Brings people in the organization together.
    • Testing is the common language between stakeholders.
    • You can have discussion over the test cases: automated quality assurance from stakeholders.
  3. Validate business hypothesis in production:
    • Enable canary releasing to lower the deployment barrier.
    • You cannot improve what you don't see. Make your pipeline data-driven.

The best practices from elite performers are available for all maturity levels: DevOps for executives.

Practical DevSecOps Using Security Instrumentation

Jeff Williams from Contrast Security talked about how we need a new approach to security that doesn't slow development or hamper innovation. He showed how you can ensure software security from the "inside out" by leveraging the power of software instrumentation. It establishes a safe and powerful way for development, security, and operations teams to collaborate.

DevSecOps is about changing security, not DevOps
What is security instrumentation?
  1. Security testing with instrumentation:
    • Add matchers to catch potentially vulnerable code and report rule violations when they happen, like using unparameterized SQL. Similar to what static code analysis does.
  2. Making security observable with instrumentation:
    • Check for e.g. access control for methods
  3. Preventing exploits with instrumentation:
    • Check that command isn't run outside of scope

The examples were written in Java but the security checks should be implementable on other platforms as well.

Modern security (inside - out)

Their AppSec platform's Community Edition is free to try out but only for Java and .Net.

Open Culture: The key to unlocking DevOps success

Chris Baynham-Hughes from Red Hat talked about how the blockers for DevOps in most organisations are people and process based rather than a lack of tooling. Addressing issues relating to culture and practice is key to breaking down organisational silos, shortening feedback loops and reducing the time to market.

Start with why
DevOps culture & Practice Enablement: openpracticelibrary.com

Three layers required for effective transformation:

  1. Technology
  2. Process
  3. People and culture
Open source culture powers innovation.

Scaling DevSecOps to integrate security tooling for 100+ deployments per day

Rasmus Selsmark from Unity talked about how Unity integrates security tooling into the deployment process. Best practices for securing your deployments involve running security scanning tools as early as possible in your CI/CD pipeline, not as an isolated step after the service has been deployed to production. The session covered best practices for securing the build and deployment pipeline, with examples and tooling.

  • Standardized CI/CD pipeline, used to deploy 200+ microservices to Kubernetes.
Shared CI/CD pipeline enables DevSecOps
Kubernetes security best practices
DevSecOps workflow: Early feedback to devs <-----> Collect metrics for security team
  • Dev:
    • Keep dependencies updated: Renovate.
    • No secrets in code: unity-secretfinder.
  • Static analysis
    • Sonarqube: Identify quality issues in code.
    • SourceClear: Information about vulnerable libraries and license issues.
    • trivy: Vulnerability Scanner for Containers.
    • Make CI feedback actionable for teams, like generating notifications directly in PRs.
  • When to trigger deployment
    • PR with at least one approver.
    • No direct pushes to master branch.
    • Only CI/CD pipeline has staging and production deployment access.
  • Deployment
    • Secrets management using Vault. Secrets separate from codebase, write-only for devs, only vault-fetcher can read. Values replaced during container startup, no environment variables passed outside to container.
  • Production
    • Container runtime security with Falco: identify security issues in containers running in production.
Standardized CI/CD pipeline allows introducing security features across teams and microservices

Data-driven DevOps: The Key to Improving Speed & Scale

Kohsuke Kawaguchi, Creator of Jenkins, from Launchable talked about how some organizations are more successful with DevOps than others and where those differences seem to be made. One difference is around data (insight) and another is around how they leverage "economies of scale".

Cost/time trade-off:

  • CFO: why do we spend so much on AWS?
    • Visibility into cost at project level
    • Make developers aware of the trade-off they are making: Build time vs. Annual cost
      • Small: 15 mins / $1000; medium: 10 mins / $2000; large: 8 mins / $3000
  • Whose problem is it?
    • A build failed: Who should be notified first?
      • Regular expression pattern matching
      • Bayesian filter

Improving the software delivery process doesn't get prioritized:

  • Data (& story) helps your boss see the problem you see
  • Data helps you apply effort to the right place
  • Data helps you show the impact of your work

Cut the cost & time of the software delivery process

  1. Dependency analysis
  2. Predictive test selection
    • You wait 1 hour for CI to clear your pull request?
    • Your integration tests only run nightly?
Predictive test selection
  • Reordering tests: Reducing time to first failure (TTFF)
  • Creating an adaptive run: Run a subset of your tests?

Deployment risk prediction: Can we flag risky deployments beforehand?

  • Learn from previous deployments to train the model

Conclusions

  • Automation is table stakes
  • Using data from automation to drive progress isn't
    • Lots of low hanging fruits there
  • Unicorns are using "big data" effectively
    • How can the rest of us get there?

Moving 100,000 engineers to DevOps on the public cloud

Sam Guckenheimer from Microsoft talked about how Microsoft transformed to using Azure DevOps and GitHub, running a globally distributed 24x7x365 service on the public cloud. The session covered organizational and engineering practices in five areas.

Customer Obsession

  • Connect our customers directly and measure:
    • Direct feedback in product, visible on public site, and captured in backlog
  • Develop personal connection and cadence
    • For top customers, have a "Champ" who maintains: regular personal contact, a long-term relationship, an understanding of customer desires
  • Definition of done: live in production, collecting telemetry that examines the hypothesis which motivated the deployment
Ship to learn

You Build It, You Love It

  • Live site incidents
    • Communicate externally and internally
    • Gather data for repair items & mitigate for customers
    • Record every action
    • Use repair items to prevent recurrence
  • Be transparent

Align outcomes, not outputs

  • You get what you measure (don't measure what you don't want)
    • Customer usage: acquisition, retention, engagement, etc.
    • Pipeline throughput: time to build, test, deploy, improve, failed and flaky automation, etc.
    • Service reliability: time to detect, communicate, mitigate; which customers affected, SLA per customer, etc.
    • "Don't" measure: original estimate, completed hours, lines of code, burndown, velocity, code coverage, bugs found, etc.
  • Good metrics are leading indicators
    • Trailing indicators: revenue, work accomplished, bugs found
    • Leading indicators: change in monthly growth rate of adoption, change in performance, change in time to learn, change in frequency of incidents
  • Measure outcomes not outputs

Get clean, stay clean

  • Progress follows a J-curve
    • Getting clean is highly manual
    • Staying clean requires dependable automation
  • Stay clean
    • Make technical debt visible on every team's dashboard

Your aim won't be perfect: Control the impact radius

  • Progressive exposure
    • Deploy one ring at a time: canary, data centers with small user counts, highest latency, the rest.
    • Feature flags control the access to new work: setting is per user within organization

Shift quality left and right

  • Pull requests control code merge to master
  • Pre-production tests check every CI build

Monthly notes 49

Working From Home edition.

Issue 49, 27.3.2020

Conferences

Now that COVID-19 has all of us in quarantine and working from home, technology conferences have also moved online and become free. Here are some.

WFHConf
Working From Home Conf with talks from technology to projects, best practices, lessons learned and about working from home. Agenda and recorded videos: part 1, part 2, part 3.

DEVOPS 2020, April 21 to 22
The Next Decade. The first day (main conference) has interesting talks on use cases and lessons learned, scaling DevSecOps, transformation journeys and more.

MagnoliaJS, April 15-17
JavaScript heavy talks like component reusability, modern JavaScript for modern browsers, JavaScript’s exciting new features, how to supercharge teams and much more.

Red Hat Summit 2020, April 28-29

Tips and tricks

How to do effective video calls
tl;dr; "Get good audio, use gallery view, mute if not talking, and welcome the cat."

Tools

Screen
Work together like you’re in the same room. Fast screen sharing with multiplayer control, drawing & video. (from @use_screen)

Kap
An open-source screen recorder built with web technology.

Setapp
The paramount collection of productive Mac apps.

Using NGINX Ingress Controller on Google Kubernetes Engine

If you've used Kubernetes you might have come across Ingress, which manages external access to services in a cluster, typically HTTP. When running on GKE the "default" is GLBC, which is a "load balancer controller that manages external loadbalancers configured through the Kubernetes Ingress API". It's easy to use but doesn't let you customize it much. The alternative is to use, for example, the NGINX Ingress Controller, which is more down to earth. Here are my notes on configuring ingress-nginx with cert-manager on Google Kubernetes Engine.

This article takes much of its content from the great tutorial at Digital Ocean.

Deploying ingress-nginx to GKE

Provider-specific steps for installing ingress-nginx on GKE are quite simple.

First you need to initialize your user as a cluster-admin with the following command:

kubectl create clusterrolebinding cluster-admin-binding \
   --clusterrole cluster-admin \
   --user $(gcloud config get-value account)

Then, if you are using a Kubernetes version prior to 1.14, you need to change kubernetes.io/os to beta.kubernetes.io/os at line 217 of mandatory.yaml.

Now you're ready to create the mandatory resources; use kubectl apply and the -f flag to specify the manifest file hosted on GitHub (or a locally downloaded copy of it):

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/nginx-0.30.0/deploy/static/mandatory.yaml

$ kubectl apply -f ingress-nginx_mandatory.yaml
namespace/ingress-nginx created
configmap/nginx-configuration created
configmap/tcp-services created
configmap/udp-services created
serviceaccount/nginx-ingress-serviceaccount created
clusterrole.rbac.authorization.k8s.io/nginx-ingress-clusterrole created
role.rbac.authorization.k8s.io/nginx-ingress-role created
rolebinding.rbac.authorization.k8s.io/nginx-ingress-role-nisa-binding created
clusterrolebinding.rbac.authorization.k8s.io/nginx-ingress-clusterrole-nisa-binding created
deployment.apps/nginx-ingress-controller created
limitrange/ingress-nginx created

Create the LoadBalancer Service:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/nginx-0.30.0/deploy/static/provider/cloud-generic.yaml
service/ingress-nginx created

Verify installation:

$ kubectl get svc --namespace=ingress-nginx
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
ingress-nginx   LoadBalancer   10.10.10.1   1.1.1.1   80:30598/TCP,443:31334/TCP   40s

$ kubectl get pods --all-namespaces -l app.kubernetes.io/name=ingress-nginx --watch
NAMESPACE       NAME                                        READY   STATUS    RESTARTS   AGE
ingress-nginx   nginx-ingress-controller-6cb75cf6dd-f4cx7   1/1     Running   0          2m17s

Configure proxy settings

In some situations the payload for ingress-nginx might be too large and you have to increase the limit. Add the "nginx.ingress.kubernetes.io/proxy-body-size" annotation to your Ingress metadata with the value you need; 0 disables the body size limit.

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"

Troubleshooting

Check the Ingress Resource Events:

$ kubectl get ing ingress-nginx

Check the Ingress Controller Logs:

$ kubectl get pods -n ingress-nginx
NAME                                        READY   STATUS    RESTARTS   AGE
nginx-ingress-controller-6cb75cf6dd-f4cx7   1/1     Running   0          149m

$ kubectl logs -n ingress-nginx nginx-ingress-controller-6cb75cf6dd-f4cx7

Check the Nginx Configuration:

kubectl exec -it -n ingress-nginx nginx-ingress-controller-6cb75cf6dd-f4cx7  cat /etc/nginx/nginx.conf

Check if used Services Exist:

kubectl get svc --all-namespaces

Promote ephemeral to static IP

If you want to keep the IP you got for ingress-nginx, promote it to static. As we bound our ingress-nginx IP to a subdomain, we want to retain that IP.

To promote the allocated IP to static, you can update the Service manifest:

kubectl --namespace=ingress-nginx patch svc ingress-nginx -p '{"spec": {"loadBalancerIP": "1.1.1.1"}}'

And promote the IP to static in GKE/GCE:

gcloud compute addresses create ingress-nginx --addresses 1.1.1.1 --region europe-north1

Creating the Ingress Resource

Create your Ingress Resource to route traffic directed at a given subdomain to a corresponding backend Service and apply it to the Kubernetes cluster.
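
For reference, a minimal ingress.yaml routing one subdomain to a backend Service could look roughly like the following (the hostname and Service name are placeholders, not from the original setup):

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: app-service
          servicePort: 80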

$ kubectl apply -f ingress.yaml
ingress.extensions/ingress created

Verify installation:

kubectl get pods --all-namespaces -l app.kubernetes.io/name=ingress-nginx --watch

Installing and Configuring Cert-Manager

Next we'll install cert-manager into our cluster. It's a Kubernetes service that provisions TLS certificates from Let’s Encrypt and other certificate authorities and manages their lifecycles.

Create namespace:

kubectl create namespace cert-manager

Install cert-manager and its Custom Resource Definitions (CRDs), like Issuers and ClusterIssuers:

kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.13.1/cert-manager.yaml

Verify installation:

kubectl get pods --namespace cert-manager

Rolling Out Production Issuer

Create a production certificate ClusterIssuer, prod_issuer.yaml:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: your-name@yourdomain.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx

Apply production issuer using kubectl:

kubectl create -f prod_issuer.yaml

Update ingress.yaml to use the "letsencrypt-prod" issuer:

metadata:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
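
Typically you also add a tls section to the Ingress spec so cert-manager knows which host to request a certificate for and which Secret to store it in (the host and secret name below are placeholders):

spec:
  tls:
  - hosts:
    - app.example.com
    secretName: app-example-com-tls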

Apply the changes:

kubectl apply -f ingress.yaml

Verify that things look good:

kubectl describe ingress
kubectl describe certificate

Done;

Keep Maven dependencies up to date

Software development projects usually come with lots of dependencies, and keeping them up to date can be burdensome if done manually. Fortunately there are tools to help you. For Node.js projects there are e.g. npm-check and npm-check-updates, and for Maven projects there are the OWASP Dependency-Check and Versions Maven plugins. Here's a short introduction on how to set up your Maven project to automatically check dependencies for vulnerabilities and for outdated dependencies.

OWASP/Dependency-Check

OWASP dependency-check is an open source solution for the OWASP Top 10 2013 entry "A9 - Using Components with Known Vulnerabilities".

Dependency-check can currently be used to scan Java and .NET applications to identify the use of known vulnerable components. The dependency-check plugin is, by default, tied to the verify or site phase depending on if it is configured as a build or reporting plugin.

The example below can be executed using mvn verify:

<project>
  ...
  <build>
    ...
    <plugins>
      ...
      <plugin>
        <groupId>org.owasp</groupId>
        <artifactId>dependency-check-maven</artifactId>
        <version>5.0.0-M3</version>
        <configuration>
          <failBuildOnCVSS>8</failBuildOnCVSS>
          <skipProvidedScope>true</skipProvidedScope>
          <skipRuntimeScope>true</skipRuntimeScope>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>check</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      ...
    </plugins>
    ...
  </build>
  ...
</project>

The example fails the build for CVSS greater than or equal to 8 and skips scanning the provided and runtime scoped dependencies.
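
With the plugin declared in the POM the check runs as part of the normal build; as far as I know the HTML report ends up under target/ (dependency-check-report.html) unless configured otherwise:

$ mvn verify
$ mvn dependency-check:check   # runs just the check, assuming the plugin prefix resolves from the POM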

Versions Maven Plugin

The Versions Maven Plugin is the de facto standard way to manage versions of artifacts in a project's POM. From high-level comparisons between remote repositories up to low-level timestamp-locking for SNAPSHOT versions, its massive list of goals allows us to take care of every aspect of our projects involving dependencies.

The example configuration of versions-maven-plugin:

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>versions-maven-plugin</artifactId>
    <version>2.7</version>
    <configuration>
        <allowAnyUpdates>false</allowAnyUpdates>
        <allowMajorUpdates>false</allowMajorUpdates>
        <allowMinorUpdates>false</allowMinorUpdates>
        <processDependencyManagement>false</processDependencyManagement>
    </configuration>
</plugin>

You could use goals that modify the pom.xml as described in the usage documentation, but often it's easier to check versions manually as you might not be able to update all of the suggested dependencies.

The display-dependency-updates goal will check all the dependencies used in your project and display a list of those dependencies with newer versions available.

Check new dependencies with:

mvn versions:display-dependency-updates

Check new plugin versions with:

mvn versions:display-plugin-updates

Summary

Using OWASP/Dependency-Check in your Continuous Integration build flow to automatically check dependencies for vulnerabilities, and periodically running the Versions Maven Plugin to check for outdated dependencies, helps you keep your project up to date and secure. Small but important things to remember while developing and maintaining a software project.

Monthly notes 48

This time the monthly notes are about learning Node.js best practices and some interesting approaches to (Node.js) software architecture. Happy reading and become a better developer!

Issue 48, 25.2.2020

Learning

Docker and Node.js Best Practices talk at DockerCon 2019
Slides and Examples.
tl;dr; Use even numbered LTS releases; Don’t use :latest tag; Use Debian:slim/stretch or Alpine; Add node_modules to .dockerignore; Use node user; Proper shutdown (--init, tini, capture SIGINT); Multi-stage builds; healthchecks;

Node.js Best Practices
More than 80 best practices, style guides, and architectural tips with additional info. The repository is a summary and curation of the top-ranked content on Node.js best practices.

Testing in production: ideas, experiences, limits, roadblocks
Talk from Bristech 2019 by Jorge Marin. "Are you afraid of testing in production? Do you test in production? Do you use real data? By definition testing in production is hard. This talk puts together my experience testing in production a large scale system that affects millions of users."

Software Architecture

Using Clean Architecture for Microservice APIs in Node.js with MongoDB and Express
This is an interesting approach to construct your application. "Talk about Bob Martin's Clean Architecture model and I will show you how we can apply it to a Microservice built in node.js with MongoDB and Express JS."

Automate versioning and changelog with release-it on GitLab CI/CD

It’s said that you should automate all the things, and one of those things could be versioning your software. Incrementing the version number in e.g. your package.json is easy, but it's even easier when you bundle it into your continuous integration and continuous deployment process. There are different tools you can use for this, and in this article we are using release-it. Other options are, for example, standard-version and semantic-release.

🚀 Automate versioning and package publishing

Using release-it with CI/CD pipeline

Release It is a generic CLI tool to automate versioning and package publishing related tasks. Its installation requires npm, but a package.json is not needed. With it you can i.a. bump the version (in e.g. package.json), create a git commit, tag and push, create a release on GitHub or GitLab, generate a changelog and make a release from any CI/CD environment.

Here is an example setup how to use release-it on Node.js project with Gitlab CI/CD.

Install and configure release-it

Install release-it with npm init release-it, which asks you questions, or manually with npm install --save-dev release-it.

For example, the package.json can look like the following, where the commit message has been customized to have "v" before the version number and npm publish is disabled (although private: true should be enough for that). You could add [skip ci] to "commitMessage" for i.a. GitLab CI/CD to skip running the pipeline on the release commit, or use the Git Push option ci.skip.

package.json
{
  "name": "example-frontend",
  "version": "0.1.2",
  "private": true,
  "scripts": {
    ...
    "release": "release-it"
  },
  "dependencies": {
    ...
  },
  "devDependencies": {
    ...
    "release-it": "^12.4.3"
  },
  "release-it": {
    "git": {
      "tagName": "v${version}",
      "requireCleanWorkingDir": false,
      "requireUpstream": false,
      "commitMessage": "Release v%s"
    },
    "npm": {
      "publish": false
    }
  }
}

Now you can run npm run release from the command line:

npm run release
npm run release -- patch --ci

In the latter command things are run without prompts (--ci) and patch increments the 0.0.x number.

Using release-it with GitLab CI/CD

Now it’s time to combine release-it with GitLab CI/CD. Adding a release-it stage is quite straightforward, but you need to do a couple of things first. In order to push the release commit and tag back to the remote, the CI/CD environment needs to be authenticated with the originating host, and we use SSH and a public key for that. You could also use a private token with HTTPS.

  1. Create SSH keys, as we are using the Docker executor: ssh-keygen -t ed25519.
  2. Create a new SSH_PRIVATE_KEY variable in "project > repository > CI / CD Settings" and paste the content of the private key you created into the Value field.
  3. In your "project > repository > Repository" settings, add a new deploy key where Title is something descriptive and Key is the content of the public key you created.
  4. Tap "Write access allowed".

Now you’re ready for git activity on your repository in the CI/CD pipeline. Your .gitlab-ci.yml release stage could look like the following.

image: docker:19.03.1

stages:
  - release

Release:
  stage: release
  image: node:12-alpine
  only:
    - master
  before_script:
    - apk add --update openssh-client git
    # Using Deploy keys and ssh for pushing to git
    # Run ssh-agent (inside the build environment)
    - eval $(ssh-agent -s)
    # Add the SSH key stored in SSH_PRIVATE_KEY variable to the agent store
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    # Create the SSH directory and give it the right permissions
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    # Don't verify Host key
    - '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config'
    - git config user.email "gitlab-runner@your-domain.com"
    - git config user.name "Gitlab Runner"
  script:
    # See https://gist.github.com/serdroid/7bd7e171681aa17109e3f350abe97817
    # Set remote push URL
    # We need to extract the ssh/git URL as the runner uses a tokenized URL
    # Replace the start of the string up to '@' with 'git@' and append a ':' before the first '/'
    - export CI_PUSH_REPO=$(echo "$CI_REPOSITORY_URL" | sed -e "s|.*@\(.*\)|git@\1|" -e "s|/|:/|" )
    - git remote set-url --push origin "ssh://${CI_PUSH_REPO}"
    # runner runs on a detached HEAD, checkout current branch for editing
    - git reset --hard
    - git clean -fd
    - git checkout $CI_COMMIT_REF_NAME
    - git pull origin $CI_COMMIT_REF_NAME
    # Run release-it to bump version and tag
    - npm ci
    - npm run release -- patch --ci --verbose

We are running release-it here with a patch increment. If you want to skip the CI pipeline on the release-it commit, you can use the ci.skip Git Push option via git.pushArgs in package.json, which tells GitLab CI/CD not to create a CI pipeline for the latest push. This way we don't need to add [skip ci] to the commit message (see the sketch below).
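
For example, the relevant part of the release-it configuration in package.json could look roughly like this (a sketch, assuming release-it's git.pushArgs option and GitLab's ci.skip push option):

"release-it": {
  "git": {
    "pushArgs": ["--follow-tags", "-o", "ci.skip"]
  }
}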

And now you're ready to run the pipeline with the release stage and enjoy automated patch updates to your application's version number. You also get GitLab Releases if you want.

Setting up the script step was not so clear, but fortunately people on the Internet had done it earlier and Google found a working gist and a comment on a GitLab issue. Interacting with git in GitLab CI/CD could be easier and there are some feature requests for that, like allowing runners to push via their CI token.

Customizing when pipelines are run

There are some more options for GitLab CI/CD pipelines if you want to run pipelines after you've tagged your version. Here's a snippet of running the "release" stage on commits to the master branch and skipping it if the commit message is for a release.

Release:
  stage: release
  image: node:12-alpine
  only:
    refs:
      - master
    variables:
      # Run only on master and commit message doesn't start with "Release v"
      - $CI_COMMIT_MESSAGE !~ /^Release v.*/
  before_script:
    ...
  script:
    ...

Now we can build a new container for the deployment of our application after it has been tagged and the version bumped. We are also reading the package.json version for tagging the image.

variables:
  PACKAGE_VERSION: $(cat package.json | grep version | head -1 | awk -F= "{ print $2 }" | sed 's/[version:,\",]//g' | tr -d '[[:space:]]')

Build dev:
  before_script:
    - export VERSION=`eval $PACKAGE_VERSION`
  stage: build
  script:
    - >
      docker build
      --pull
      --tag your-docker-image:latest
      --tag your-docker-image:$VERSION.dev
      .
    - docker push your-docker-image:latest
    - docker push your-docker-image:$VERSION.dev
  only:
    refs:
      - master
    variables:
      # Run only on master and commit message starts with "Release v"
      - $CI_COMMIT_MESSAGE =~ /^Release v.*/

Using release-it on detached HEAD

In the previous example we checked out the current branch for editing, as the runner runs on a detached HEAD. You can use the detached HEAD as shown below, but the downside is that you can't create GitLab Releases from the pipeline as it fails with "ERROR Response code 422 (Unprocessable Entity)". This is because (I suppose) it doesn't do the git push the way it's done manually with git.

Then the .gitlab-ci.yml is the following:

...
script:
    - export CI_PUSH_REPO=$(echo "$CI_REPOSITORY_URL" | sed -e "s|.*@\(.*\)|git@\1|" -e "s|/|:/|" )
    - git remote set-url --push origin "ssh://${CI_PUSH_REPO}"
    # gitlab-runner runs on a detached HEAD, create a temporary local branch for editing
    - git checkout -b ci_processing
    # Run release-it to bump version and tag
    - npm ci
    - DEBUG=release-it:* npm run release -- patch --ci --verbose --no-git.push
    # Push changes to originating branch
    # Always return true so that the build does not fail if there are no changes
    - git push --follow-tags origin ci_processing:${CI_COMMIT_REF_NAME} || true

Reset Hasura migrations and squash files

Using GraphQL for creating APIs is nowadays popular and there are different tools you can use. One of them is Hasura, an open-source engine that gives you realtime GraphQL APIs on new or existing Postgres databases. Hasura is quite easy to work with, but if your GraphQL schemas change a lot it creates plenty of migration files. This has some unwanted consequences (for example slowing down hasura migrate apply or even blocking it). Here are some notes on how to reset the state and create new migrations from the state that is on the server.

Note: From Hasura 1.0.0 onwards squashing is easier with the hasura migrate squash command, which is still in preview. Before Hasura 1.0.0 you have to squash migrations manually, and this blog post explains how. The result is the same: multiple migrations squashed into a single one.
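
With the newer CLI the manual steps below can roughly be replaced with something like this (a sketch; check hasura migrate squash --help for the exact flags in your version):

$ hasura migrate squash --name "init" --from <start-version> --endpoint=http://localhost:8080 --admin-secret=changeme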

The Hasura documentation provides a good guide on how to squash migrations, but in practice there are a couple of other things you may need to address. So let's combine the steps Hasura gives with some extra steps.

Reset Hasura migrations

First make a backup branch:

  1. $ git checkout master
  2. Create a backup branch:
    $ git checkout -b backup/migrations-before-resetting-20XX-XX-XX
  3. Update the backup branch to origin:
    $ git push origin backup/migrations-before-resetting-20XX-XX-XX

We are assuming you have a local Hasura running on Docker with something like the following docker-compose.yml:

version: "3.6"
services:
  postgres:
    image: postgres:11-alpine
    restart: always
    ports:
      - "5432:5432"
    volumes:
      - db_data:/var/lib/postgresql/data
    command: postgres -c max_locks_per_transaction=2000
  graphql-engine:
    image: hasura/graphql-engine:v1.0.0-beta.6
    ports:
      - "8080:8080"
    depends_on:
      - "postgres"
    restart: always
    environment:
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:@postgres:5432/postgres
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true" # set to "false" to disable console
      HASURA_GRAPHQL_ADMIN_SECRET: changeme
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
volumes:
  db_data:

Create local instance of Hasura with up to date migrations:

  1. $ docker-compose down -v
  2. $ docker-compose up
  3. $ hasura migrate apply --endpoint=http://localhost:8080 --admin-secret=changeme

Reset migrations to master:

  1. git checkout master
  2. git checkout -b reset-hasura-migrations
  3. rm -rf migrations/*

Reset the migration history on the server. In the Hasura SQL console, http://localhost:8080/console:

TRUNCATE hdb_catalog.schema_migrations;

Set up fresh migrations by taking the schema and metadata from the server. By default init only takes the public schema if other schemas are not specified with the --schema "your schema" parameter (an example with extra schemas follows the list below). Note down the version for later use.

  1. Create migration file:
    $ hasura migrate create "init" --from-server
  2. Mark the migration as applied on this server:
    $ hasura migrate apply --version "<version>" --skip-execution
  3. Verify status of migrations, should show only one migration with Present status:
    $ hasura migrate status
  4. You have brand new migrations now!
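
If your project uses more schemas than public (for example an audit schema, as in the extra steps later), the init migration can be created with multiple --schema flags, roughly like this (check your CLI version's behaviour):

$ hasura migrate create "init" --from-server --schema public --schema audit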

Resetting migrations on other environments

  1. Checkout the reset branch on local machine:
    $ git checkout -b reset-hasura-migrations
  2. Reset the migration history on remote server. On Hasura SQL console:
    TRUNCATE hdb_catalog.schema_migrations;
  3. Apply migration status to remote server:
    $ hasura migrate apply --version "<version>" --skip-execution

Local environment Hasura status

For other developers: please refer to these instructions in order to get the backend into the same state.

Option 1: Keep old data

  1. Checkout the backup branch on local machine:
    $ git checkout backup/migrations-before-resetting-20XX-XX-XX
  2. Reset the migration history on local server. On Hasura SQL console:
    TRUNCATE hdb_catalog.schema_migrations;
  3. Apply migration status to local server:
    $ hasura migrate apply --version "<version>" --skip-execution

Option 2: Remove all and start from beginning

  1. Clean up the old docker volumes:
    $ docker-compose down -v
  2. Start up services:
    $ docker-compose up
  3. Checkout master:
    $ git checkout master
  4. Apply migrations:
    $ hasura migrate apply --endpoint=http://localhost:8080 --admin-secret=changeme

Possible extra steps

Now your Hasura migrations and database tables are in one init migration file, but sometimes things don't work out when applying it to an empty database. We are using Hasura audit-trigger and had to reorder the SQL clauses generated by migrate init and add some missing parts.

  1. Move schema creations after audit clauses
  2. Move audit.audit_table(target_table regclass) to last audit clause and copy it from audit.sql
  3. Add the pg_trgm extension as done previously (fixes "operator does not exist: text <% text" in public.search_customers_by_name); see the SQL snippet after this list
  4. Drop session constraints / index before creating new
  5. Create session table only if not exists
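
For item 3, the pg_trgm addition is just the standard extension clause, placed before the schema objects that use it:

CREATE EXTENSION IF NOT EXISTS pg_trgm;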

Monthly notes 47

Issue 47: 30.1.2020

War Stories

#Y2038 problem. "It's *already here*. Fix your stuff."
In many systems time is represented as number of seconds passed since 00:00:00 UTC on 1 Jan 1970 and stored as signed 32-bit integer. Such implementations can't encode times after 03:14:07 UTC on 19 January 2038. (from @walokra)

Ops Lessons We All Learn The Hard Way
Good Twitter thread of lessons learned on Ops.

Web

Front-End Performance Checklist 2020
Great resource to read for better front-end performance. Remember, if you don’t measure it, you can’t improve it 🚀 To get you started: use i.a. PageSpeed and Lighthouse to see where you stand.

JavaScript

20 ways to become better node.js developer in 2020
tl;dr; Sleep more; Use Jest & Ava; GraphQL!; Check Nest.js; Gradual deployment; Test in production? Learn Docker & Kubernetes; Read vulnerable code; Use monitoring; CI with quality tools;

Kubernetes

How Soon We Forget: Security in the Age of Docker & Kubernetes
Good starting point for hardening your containers and Kubernetes cluster. tl;dr; Running as non-root. Read-only file system. Not terminating TLS too soon. Setting resource limits (denial of service). Use Kubernetes policies. (from @nicolas_frankel)

Software Development

Goodbye, Clean Code
AHA! Avoid Hasty Abstractions. Prefer duplication over the wrong abstraction. Check also Dan’s tweet’s thread.

Remote working tips
tl;dr; the thread: 1) activity signal to start work day 2) frequent small breaks 3) dedicate a space for work (not sofa/bed) 4) take sick days 5) connect with other humans 6) non-project connection point with peers 7) block out distractions 8) go out for lunch.

Tools

tiny-helpers.dev
Collection of useful single-purpose online tools that are useful for web devs. (from @stefanjudis)

Insomnia
"Debug APIs like a human, not a robot". If you don't like Postman then try Insomnia which says to be powerful HTTP and GraphQL tool belt and open source. Seems that it doesn't have similar scripting and testing features as Postman though.

Something different

Finding the best bicycle chain
tl;dr; No major reason not to use the drivetrain manufacturer’s recommended chain. The lubricant you use will play the most critical role in drivetrain durability. Run a good lube and keep your drivetrain clean.