10 Best DevOps Tools for Start-ups

Author's note: the opinions and thoughts expressed in this blog post (including but not limited to the choice of tools, reviews of and comments on tools, comparisons, etc.) are my own and do not represent the opinions of DevStream or my employer.

OK. You are in a small start-up, and you want to move fast. To move fast, you need automation instead of doing things manually. So you need a set of DevOps tools that can accelerate your software development lifecycle (SDLC).

The thing is, though, hundreds of tools can help you increase engineering productivity.

Which ones to choose? That's the million-dollar question.

But worry no more; I've got you covered. Fasten your seatbelt and read on, because this blog post is about maximizing your SDLC speed.


1 Terraform

Before writing the first line of code, you probably need to decide on the infrastructure to run your workload. So, let's start with infra.

You've got two options:

  • on-premises (rent/build your own data center)
  • cloud

If you are a small start-up, chances are you can't afford to build or rent data centers. Also, start-ups are, by definition, fast-moving entities that prioritize speed over almost everything else; hence you probably can't afford the operational overhead (not to mention the financial overhead) of managing physical infrastructure.

So, the cloud is your obvious choice.

And this leaves me no option but to start with the elephant in the room: Terraform.

Nowadays, you probably can't talk about the cloud without talking about Terraform. First released in 2014, it is now used by everyone from start-ups to global corporations. Terraform defines your infrastructure as code, so applying the same configuration produces the same result every time. Combined with the cloud, you can have your infrastructure up and running in no time (and tear it down as soon as you don't need it anymore).
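To make the "same result every time" idea concrete, here is a toy Python sketch of a declarative plan/apply cycle. It is not how Terraform is implemented and it uses no Terraform API; the resource names and the plan and apply helpers are hypothetical, chosen only to illustrate the principle.

```python
from typing import Dict, List, Tuple

# State maps a resource name to its attributes (all names here are made up).
State = Dict[str, dict]

def plan(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the (action, resource) steps needed to turn `actual` into `desired`."""
    steps = []
    for name, attrs in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != attrs:
            steps.append(("update", name))
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))
    return steps

def apply(desired: State, actual: State) -> State:
    """Execute the plan and return the new actual state."""
    new_state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = dict(desired[name])
    return new_state

desired = {"vpc-main": {"cidr": "10.0.0.0/16"}, "vm-web": {"size": "small"}}
state = apply(desired, {})   # the first run creates everything
print(plan(desired, state))  # a second run has nothing left to do: []
```

Because the description says what should exist rather than which commands to run, applying it a second time is a no-op, and applying an empty description tears everything down.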

To learn more about infrastructure as code and Terraform, read my blog posts:


2 Kubernetes

Now you've got your infrastructure. What next?

Yep, containerized workload.

A widely accepted methodology for building cloud-native applications is the Twelve-Factor App, which describes principles and practices for building cloud-optimized apps. Systems built on these principles can be deployed and scaled rapidly, and can add features quickly in response to market changes.

Of the 12 factors, special attention is given to portability (across environments, through declarative automation) and disposability. Your workloads (service instances) should be disposable: they should start up fast, which makes scaling easier, and shut down gracefully, which leaves the system in a correct state.
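As a rough illustration of graceful shutdown, the Python sketch below reacts to SIGTERM (the signal Kubernetes sends to a container before stopping it) by finishing the current job and refusing new ones. The serve function and its uppercase "work" are placeholders invented for this example.

```python
import signal
import threading

# Set when the platform asks the process to stop (e.g. SIGTERM).
stop_requested = threading.Event()

def handle_sigterm(signum, frame):
    """Signal handler: only record the request; the work loop does the cleanup."""
    stop_requested.set()

def serve(jobs):
    """Process jobs until they run out or a shutdown has been requested."""
    done = []
    for job in jobs:
        if stop_requested.is_set():
            break  # stop taking new work, but never abandon a job halfway
        done.append(job.upper())  # placeholder for real work
    return done

def install_handlers():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)

if __name__ == "__main__":
    install_handlers()
    print(serve(["resize-image", "send-email"]))
```

The handler does nothing but set a flag, so the process always exits from a known point in the loop instead of in the middle of a job.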

Docker containers (along with an orchestrator) inherently satisfy this requirement, which is why Kubernetes has become the de facto place to run cloud-native workloads.

Also known as K8s, Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications.

No matter which cloud you are using, there is a managed Kubernetes service that lets you create a cluster quickly. You can use Terraform for that; some cloud providers even offer tools that create a cluster from a config file at the press of the Enter key. For example, Amazon EKS has eksctl for this.

As the number of your services grows, you will probably need a service mesh to add an extra layer of security, observability, and reliability to Kubernetes. The best-known option is probably Istio, but it's not the recommendation here: Istio is a bit too complicated. For a fast-moving start-up, the recommendation is Linkerd.

Linkerd is still a service mesh, but it's ultralight (hence fast) and simple. You can have it up and running from the command line within a few minutes, with minimal operational overhead.


3 Trello

Now that we've sorted out the cloud infrastructure, let's move on to project management.

As a start-up, you will probably do agile software development, and the best-known agile frameworks are probably Scrum and Kanban. Compared to Scrum, Kanban has fewer "rituals" (meetings), hence less operational overhead.

No matter your choice, you need a shared tool to track your work in progress. This is where issue and project tracking tools like Jira come in.

However, Jira is relatively complicated. You can definitely get more value out of it if you know how to use all its features and quirks, but to get started quickly, the free version of Trello should suffice.

Alternatively, you can use GitHub Projects in your GitHub repository.


4 GitHub + GitHub Actions

OK, now that we've got a place to manage our project and track issues, let's code.

Where to store your source code? You need a source code management system, and GitHub is a no-brainer.

Plus, you can integrate Trello with GitHub. For example, suppose you are working on an open-source project where your users would raise issues in GitHub issues, but you need to track them in Trello. In that case, you can automatically synchronize those GitHub issues to Trello tickets.

One of the reasons we picked GitHub as our source code management system is GitHub Actions.

GitHub Actions makes it easy to automate all your software workflows. Build, test, and even deploy your code right from your source code management system. Although it can do continuous deployment (CD), we only intend to use it as our continuous integration (CI) system here.

One benefit of using GitHub Actions as your CI is that CI, by definition, interacts with your code a lot. If your code and your CI live in the same system, you save yourself the trouble of integrating your code repositories with a separate CI system: no more overhead configuring authentication, authorization, webhooks, or what have you.

If, for example, you decide to deploy something like Jenkins or Tekton as your CI, it will take up some of your infrastructure's resources. GitHub Actions, by contrast, comes with a free quota that is probably more than enough when you have just started your company, so it costs you nothing and you don't need to register any self-hosted runners.

For an introduction to CI and how to choose the right CI for you, see my blog post An Introduction to CI. It's an old article, but the principles apply.


5 Argo CD

Once you have your code and the continuous integration workflow set up, you need to deploy your apps, hopefully in an automated fashion. That is what continuous deployment is for: the software is delivered through automated deployments.

The GitOps pattern is a continuous deployment practice using Git repositories as the source of truth for defining the desired application state. Application definitions, configurations, and environments should be declarative and version controlled; application deployment and lifecycle management should be automated, auditable, and easy to understand.

Argo CD is a declarative GitOps continuous delivery tool for Kubernetes. It automates the deployment of the desired application states to the specified target environments.
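The GitOps idea of "Git as the source of truth" can be sketched in a few lines of Python. This is a toy model, not Argo CD's implementation or API: the GitRepo class, the commit ids, and the image tags are hypothetical, and only the Synced / OutOfSync labels are borrowed from the status names Argo CD shows.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class GitRepo:
    """Stand-in for a Git repository: commit id -> manifests (app name -> image)."""
    commits: Dict[str, Dict[str, str]] = field(default_factory=dict)
    head: str = ""

    def commit(self, commit_id: str, manifests: Dict[str, str]) -> None:
        self.commits[commit_id] = dict(manifests)
        self.head = commit_id

def sync_status(repo: GitRepo, live: Dict[str, str]) -> str:
    """Compare the live cluster state with the manifests at HEAD."""
    return "Synced" if live == repo.commits[repo.head] else "OutOfSync"

def sync(repo: GitRepo, live: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Make the live state match Git and report which apps had to change."""
    desired = repo.commits[repo.head]
    changed = sorted(
        name for name in desired.keys() | live.keys()
        if desired.get(name) != live.get(name)
    )
    return dict(desired), changed

repo = GitRepo()
repo.commit("a1b2c3", {"web": "web:1.0"})
live, _ = sync(repo, {})                      # initial deployment
live["web"] = "web:hotfix"                    # someone changes the cluster by hand
status_after_drift = sync_status(repo, live)  # "OutOfSync"
live, _ = sync(repo, live)                    # Git wins: the manual change is reverted
```

Deploying a new version then means committing a new manifest, and rolling back means reverting that commit; the cluster is never edited directly.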

For more information on continuous deployment and GitOps, see my blog posts:

If you prefer other tools for GitOps, there is also Flux CD.


6 Doppler

Every stage of the software development lifecycle is intertwined with secrets and credentials.

When you launch a virtual machine, you might need to provide an initial password.

When you are developing an app, the app itself might need access to a database, so it needs the password. Or maybe the app talks to another API that requires authentication, so you need a token, which is a secret.

In your CI system, you might want to write something back to your version control system (for example, build status, tags, etc.), so you have to create a user for your CI and save its password somewhere safe so that your CI system can read it.

In your CD system, you might want to SSH into a machine using a private key to do deployment; or maybe you don't use virtual machines, but instead, you have containers running in a Kubernetes cluster, in which case, you still need to manage the access from your CD system to your Kubernetes cluster.

Secrets are involved in every stage of the software development cycle, which is why you need to manage them properly.

Doppler enables developers and security teams to keep their secrets and app configuration in sync and secure across devices, environments, and team members. Say goodbye to .env files.

Other choices include HashiCorp Vault, AWS Secrets Manager, etc., but the features differ. The lesser-known Doppler (compared to HashiCorp Vault) is recommended here because of its great Kubernetes integration, where secrets can be synchronized to Kubernetes as native secrets, and the update of a secret in Doppler can trigger a redeployment of your K8s app. For more information on Doppler, see my blog post Doppler: A Brief Introduction to Secrets Managers.
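On the application side, a secrets manager usually boils down to reading values from the environment instead of from a checked-in .env file (the Doppler CLI, for instance, can inject secrets as environment variables when it launches a process). Below is a minimal Python sketch of that pattern; the variable names DATABASE_URL and API_TOKEN and the helper functions are assumptions made up for this example.

```python
import os
from typing import Dict, Mapping, Sequence

class MissingSecretError(RuntimeError):
    """Raised at start-up when a required secret is not provided."""

def load_secrets(required: Sequence[str], env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the required secrets from the environment and fail fast if any is absent."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise MissingSecretError("missing secrets: " + ", ".join(sorted(missing)))
    return {name: env[name] for name in required}

def redact(secrets: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy that is safe to log: every value is masked."""
    return {name: "***" for name in secrets}

# A fake environment for demonstration; in production `env` defaults to os.environ.
fake_env = {"DATABASE_URL": "postgres://example", "API_TOKEN": "t0ken"}
print(redact(load_secrets(["DATABASE_URL", "API_TOKEN"], fake_env)))
```

Failing at start-up when a secret is missing is a deliberate choice: a misconfigured deployment is caught immediately rather than on the first database call.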

If you only need to share secrets within a team, there are other choices, like SOPS. For more information on SOPS, see my blog post A Comprehensive Guide to SOPS: Managing Your Secrets Like A Visionary, Not a Functionary.

For more information on secrets managers, see my blog posts:


7 Trivy

Since we rely heavily on container images for our cloud-native workloads, image security has become an increasingly important topic.

Container images play a crucial role in container security. Any container created from an image inherits all its characteristics—including security vulnerabilities, misconfigurations, or even malware.

Trivy is a security scanner. It is reliable, fast, easy to use, and works wherever you need it. Trivy has different scanners that look for various security issues; its best-known use case is scanning container images for known vulnerabilities (CVEs).

You can run it locally as a CLI tool to scan your container images and other artifacts before pushing them to a container registry or deploying your application.

What's more, Trivy is designed to be used in CI and can be easily integrated with your CI pipelines.
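To show what a CI gate built on a scan report might look like, here is a small Python sketch. It assumes a JSON report shaped like Results -> Vulnerabilities -> Severity; treat that layout, the helper names, and the sample data (placeholder CVE ids and package names) as assumptions for illustration, and check them against the report your Trivy version actually produces.

```python
from typing import List

BLOCKING = {"HIGH", "CRITICAL"}  # severities that should fail the pipeline

def blocking_findings(report: dict, blocking=BLOCKING) -> List[str]:
    """Collect findings whose severity is in `blocking`."""
    found = []
    for result in report.get("Results", []):
        # A target without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in blocking:
                found.append(f'{vuln.get("VulnerabilityID")} in {vuln.get("PkgName")}')
    return found

def exit_code(report: dict) -> int:
    """Return 1 (fail the build) if anything blocking was found, else 0."""
    return 1 if blocking_findings(report) else 0

# Made-up sample report; the ids and packages are placeholders, not real CVEs.
sample_report = {
    "Results": [
        {
            "Target": "example-image:latest",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libdemo", "Severity": "LOW"},
                {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libexample", "Severity": "CRITICAL"},
            ],
        },
        {"Target": "app/requirements.txt"},  # a target with no findings
    ]
}
print(blocking_findings(sample_report), exit_code(sample_report))
```

In practice you may not need a script at all: Trivy's own severity filter and exit-code options can fail a pipeline directly. A wrapper like this is mainly useful when you want custom rules, such as an allow-list.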


8 Prometheus + Grafana + Loki

Prometheus is probably the most famous open-source systems monitoring and alerting toolkit. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community.

Prometheus collects and stores its metrics as time series data, i.e., metrics information is stored with the timestamp at which it was recorded.
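A time series in this sense is a stream of timestamped values identified by a metric name plus a set of labels. The Python sketch below models that idea in memory; it is not Prometheus's storage engine, the TinyTSDB class is hypothetical, and the rate calculation is a much simplified cousin of what a real query language offers.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, value)

class TinyTSDB:
    """In-memory store: one list of samples per (metric name, label set)."""

    def __init__(self) -> None:
        self._series = defaultdict(list)

    def append(self, name: str, labels: Dict[str, str], timestamp: float, value: float) -> None:
        key = (name, frozenset(labels.items()))
        self._series[key].append((timestamp, value))

    def samples(self, name: str, labels: Dict[str, str]) -> List[Sample]:
        return list(self._series.get((name, frozenset(labels.items())), []))

    def per_second_rate(self, name: str, labels: Dict[str, str]) -> float:
        """Average increase per second between the first and the last sample."""
        points = self.samples(name, labels)
        if len(points) < 2:
            return 0.0
        (t0, v0), (t1, v1) = points[0], points[-1]
        return 0.0 if t1 == t0 else (v1 - v0) / (t1 - t0)

db = TinyTSDB()
db.append("http_requests_total", {"method": "GET"}, 100.0, 10.0)
db.append("http_requests_total", {"method": "GET"}, 160.0, 40.0)
print(db.per_second_rate("http_requests_total", {"method": "GET"}))  # 0.5
```

Thirty more requests over sixty seconds gives 0.5 requests per second; the same metric name with a different label value would be a separate series.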

Grafana, on the other hand, enables you to query, visualize, alert on, and explore your metrics, logs and traces wherever they are stored.

The Prometheus/Grafana combo is probably the best-known centralized monitoring stack. The easiest way to install it on Kubernetes is the Prometheus Operator, which manages Prometheus clusters atop Kubernetes. See the open-source project kube-prometheus for more info.

Once we have centralized monitoring, we need centralized logging as well. The ELK stack is a popular choice for that, but since we've already got Grafana, Loki is the recommendation here.

Loki is a log aggregation system designed to store and query logs from all your applications and infrastructure, and it can show logs inside Grafana. You can use Grafana to see both logs and monitoring metrics!

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost-effective and easy to operate: it does not index the contents of the logs, only a set of labels for each log stream. That makes it a more logical choice than installing a whole separate stack such as ELK.
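The "index labels, not content" design can be illustrated with a toy Python store: streams are looked up by their labels, and the text filter is a plain scan over the selected streams only. The TinyLogStore class and the sample log lines are hypothetical; the query shape loosely mirrors a LogQL query such as {app="web"} |= "timeout".

```python
from collections import defaultdict
from typing import Dict, List

class TinyLogStore:
    """Streams are keyed by their label set; log lines themselves are never indexed."""

    def __init__(self) -> None:
        self._streams = defaultdict(list)  # frozenset of labels -> [(timestamp, line)]

    def push(self, labels: Dict[str, str], timestamp: float, line: str) -> None:
        self._streams[frozenset(labels.items())].append((timestamp, line))

    def query(self, selector: Dict[str, str], contains: str = "") -> List[str]:
        wanted = set(selector.items())
        hits = []
        for labels, entries in self._streams.items():
            if wanted <= labels:  # cheap step: match on labels only
                # expensive step: scan the text, but only inside matching streams
                hits.extend(line for _, line in entries if contains in line)
        return hits

store = TinyLogStore()
store.push({"app": "web", "env": "prod"}, 1.0, "GET /health 200")
store.push({"app": "web", "env": "prod"}, 2.0, "upstream timeout")
store.push({"app": "db", "env": "prod"}, 3.0, "connection timeout")
print(store.query({"app": "web"}, contains="timeout"))  # ['upstream timeout']
```

The index stays small because it grows with the number of streams rather than the volume of log text, which is where the cost savings come from.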


9 Jaeger

We've already got centralized logging and monitoring. Why Jaeger?

For modern cloud-native, distributed microservice workloads, most operational problems ultimately come down to networking and observability.

Compared to the old monolithic application, networking and debugging a set of intertwined distributed services is an order of magnitude harder.

So, logging and monitoring alone won't be enough; we need to be able to monitor distributed transactions, optimize performance and latencies, and analyze root causes.

And this is precisely where an end-to-end distributed tracing tool like Jaeger comes in: it helps monitor and troubleshoot transactions in complex distributed systems.
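Under the hood, a trace is a tree of spans: each span records which service did the work, how long it took, and which span called it. The Python sketch below uses a made-up three-span trace to show how that structure answers the question "where did the time go?". The Span fields and helper names are assumptions for illustration, not Jaeger's data model or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None for the root span of the trace
    service: str
    duration_ms: int

def self_times(spans: List[Span]) -> Dict[str, int]:
    """Time spent in each span excluding its direct children.

    Assumes children of the same parent do not overlap in time.
    """
    result = {s.span_id: s.duration_ms for s in spans}
    for s in spans:
        if s.parent_id in result:
            result[s.parent_id] -= s.duration_ms
    return result

def slowest_service(spans: List[Span]) -> str:
    """Name the service owning the span with the largest self time."""
    times = self_times(spans)
    by_id = {s.span_id: s for s in spans}
    return by_id[max(times, key=times.get)].service

# A hypothetical request: gateway -> orders -> db
trace = [
    Span("a", None, "gateway", 120),
    Span("b", "a", "orders", 100),
    Span("c", "b", "db", 80),
]
print(self_times(trace))       # {'a': 20, 'b': 20, 'c': 80}
print(slowest_service(trace))  # db
```

Looking only at the gateway's logs or metrics, the request simply took 120 ms; the parent/child links are what reveal that 80 ms of it were spent in the database.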


10 Opsgenie

OK, now we have everything we need to run our workloads and make them observable. Last but not least comes incident response and incident management.

Incident management tools try to solve one problem: acting as a single source of truth, they centralize alerts and notify only the right people at the right time.

You can define your own on-call schedules and routing rules to fit any workflow, so that critical alerts don't slip through.
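Here is a toy Python sketch of the three ideas above: deduplicating alerts, routing them to a team, and picking whoever is on call. It is not Opsgenie's API; the team names, people, routing table, and the equal-shifts-per-day schedule are all invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Alert:
    key: str       # alerts with the same key are treated as duplicates
    service: str
    severity: str

# Hypothetical routing rules and on-call rosters.
ROUTES = {"payments": "team-payments", "web": "team-frontend"}
DEFAULT_TEAM = "team-platform"
ON_CALL = {
    "team-payments": ["alice", "bob"],
    "team-frontend": ["carol"],
    "team-platform": ["dave"],
}

def on_call(team: str, hour: int) -> str:
    """Split the 24 hours of a day into equal shifts, one per team member."""
    people = ON_CALL[team]
    shift_hours = 24 // len(people)
    return people[min(hour // shift_hours, len(people) - 1)]

def route(alerts: List[Alert], hour: int) -> Dict[str, List[str]]:
    """Return who gets paged for which alert keys; duplicates and non-critical alerts are dropped."""
    seen = set()
    pages: Dict[str, List[str]] = {}
    for alert in alerts:
        if alert.severity != "critical" or alert.key in seen:
            continue
        seen.add(alert.key)
        team = ROUTES.get(alert.service, DEFAULT_TEAM)
        pages.setdefault(on_call(team, hour), []).append(alert.key)
    return pages

incoming = [
    Alert("payments-db-down", "payments", "critical"),
    Alert("payments-db-down", "payments", "critical"),  # duplicate from a second monitor
    Alert("web-slow", "web", "warning"),
    Alert("disk-full", "batch", "critical"),            # no route: falls back to the default team
]
print(route(incoming, hour=13))
```

At 13:00 the second payments shift is active, so the duplicate database alert pages one person once, the warning pages nobody, and the unrouted alert falls back to the platform team.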

Popular choices include Opsgenie and PagerDuty, and here Opsgenie is recommended. It's only a personal preference; Opsgenie seems prettier and more user-friendly, and it has all the integrations you need.


Summary

OK, here we have them: 10 easy-to-use, (mostly) open-source DevOps tools to boost a start-up's SDLC:

  1. Infrastructure as Code: Terraform
  2. Infrastructure: Cloud/Kubernetes/Linkerd
  3. Project/Issue Tracking: Trello
  4. Source Code Management + CI: GitHub + GitHub Actions
  5. CD/GitOps: Argo CD
  6. Secrets Manager: Doppler
  7. Container Image Security: Trivy
  8. Centralized Logging/Monitoring: Prometheus + Grafana + Loki
  9. Centralized Tracing/Observability: Jaeger
  10. Incident Management: Opsgenie

If you like this article, please like, comment, and subscribe.

Soon, I'll write about open-source DevOps tools and newly emerged tools.

Stay tuned!

© Copyright belongs to the author. For reprints or content partnerships, please contact the author.