website: vs page

This commit is contained in:
Armon Dadgar 2014-07-25 10:51:30 -04:00
parent a42661004d
commit 0850e5691c
6 changed files with 7 additions and 259 deletions


@@ -6,11 +6,13 @@ sidebar_current: "vs-other"
# Terraform vs. Other Software
The problems Terraform solves are varied, but each individual feature has been
solved by many different systems. Although there is no single system that provides
all the features of Terraform, there are other options available to solve some of these problems.
In this section, we compare Terraform to some other options. In most cases, Terraform is not
mutually exclusive with any other system.
Terraform provides a flexible abstraction of resources and providers. This model
allows for representing everything from physical hardware and virtual machines to
containers, email, and DNS providers. Because of this flexibility, Terraform
can be used to solve many different problems. This means there are a number of
existing tools that overlap with the capabilities of Terraform. We compare Terraform
to a number of these tools, but it's good to note that Terraform is not mutually
exclusive with any other system, nor does it require total buy-in to be useful.
Use the navigation to the left to read the comparison of Terraform to specific
systems.


@@ -1,49 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. Nagios, Sensu"
sidebar_current: "vs-other-nagios-sensu"
---
# Terraform vs. Nagios, Sensu
Nagios and Sensu are both tools built for monitoring. They are used
to quickly notify operators when an issue occurs.
Nagios uses a group of central servers that are configured to perform
checks on remote hosts. This design makes it difficult to scale Nagios,
as large fleets quickly reach the limit of vertical scaling, and Nagios
does not easily scale horizontally. Nagios is also notoriously
difficult to use with modern DevOps and configuration management tools,
as local configurations must be updated when remote servers are added
or removed.
Sensu has a much more modern design, relying on local agents to run
checks and pushing results to an AMQP broker. A number of servers
ingest and handle the result of the health checks from the broker. This model
is more scalable than Nagios, as it allows for much more horizontal scaling,
and a weaker coupling between the servers and agents. However, the central broker
has scaling limits, and acts as a single point of failure in the system.
Terraform provides the same health checking abilities as both Nagios and Sensu,
is friendly to modern DevOps, and avoids the scaling issues inherent in the
other systems. Terraform runs all checks locally, like Sensu, avoiding placing
a burden on central servers. The status of checks is maintained by the Terraform
servers, which are fault tolerant and have no single point of failure.
Lastly, Terraform can scale to vastly more checks because it relies on edge triggered
updates. This means that an update is only triggered when a check transitions
from "passing" to "failing" or vice versa.
In a large fleet, the majority of checks are passing, and even the minority
that are failing are persistent. By capturing changes only, Terraform reduces
the amount of networking and compute resources used by the health checks,
allowing the system to be much more scalable.
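To make the edge-triggered model concrete, here is a minimal sketch of an agent loop that only reports a result when a check changes state. This is not the actual agent implementation; the helper functions are invented for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// runCheck is a stand-in for any local health check; it returns true if passing.
// In a real agent this would execute a check script or probe a service.
func runCheck() bool { return true }

// reportTransition is a stand-in for sending an update to the servers.
func reportTransition(passing bool) {
	fmt.Println("status changed, passing =", passing)
}

func main() {
	lastPassing := runCheck()
	reportTransition(lastPassing) // report the initial state once

	for range time.Tick(10 * time.Second) {
		passing := runCheck()
		// Edge triggered: only send an update when the check transitions
		// between "passing" and "failing". Checks in a steady state
		// generate no network traffic at all.
		if passing != lastPassing {
			reportTransition(passing)
			lastPassing = passing
		}
	}
}
```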
An astute reader may notice that if a Terraform agent dies, then no edge triggered
updates will occur. From the perspective of other nodes all checks will appear
to be in a steady state. However, Terraform guards against this as well. The
[gossip protocol](/docs/internals/gossip.html) used between clients and servers
integrates a distributed failure detector. This means that if a Terraform agent fails,
the failure will be detected, and thus all checks being run by that node can be
assumed failed. This failure detector distributes the work among the entire cluster,
and critically enables the edge triggered architecture to work.


@@ -1,46 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. Serf"
sidebar_current: "vs-other-serf"
---
# Terraform vs. Serf
[Serf](http://www.serfdom.io) is a node discovery and orchestration tool and is the only
tool discussed so far that is built on an eventually consistent gossip model,
with no centralized servers. It provides a number of features, including group
membership, failure detection, event broadcasts and a query mechanism. However,
Serf does not provide any high-level features such as service discovery, health
checking or key/value storage. To clarify, the discovery feature of Serf is at a node
level, while Terraform provides a service and node level abstraction.
Terraform is a complete system providing all of those features. In fact, the internal
[gossip protocol](/docs/internals/gossip.html) used within Terraform is powered by
the Serf library. Terraform leverages the membership and failure detection features,
and builds upon them.
The health checking provided by Serf is very low level and only indicates if the
agent is alive. Terraform extends this to provide a rich health checking system
that handles liveness in addition to arbitrary host- and service-level checks.
Health checks are integrated with a central catalog that operators can easily
query to gain insight into the cluster.
The membership provided by Serf is at a node level, while Terraform focuses
on the service-level abstraction, mapping a single node to multiple services.
This can be simulated in Serf using tags, but it is much more limited and does
not provide useful query interfaces. Terraform also makes use of a strongly consistent
catalog, while Serf is only eventually consistent.
In addition to the service level abstraction and improved health checking,
Terraform provides a key/value store and support for multiple datacenters.
Serf can run across the WAN but with degraded performance. Terraform makes use
of [multiple gossip pools](/docs/internals/architecture.html), so that
the performance of Serf over a LAN can be retained while still using it over
a WAN for linking together multiple datacenters.
Terraform is opinionated in its usage, while Serf is a more flexible and
general purpose tool. Terraform uses a CP architecture, favoring consistency over
availability. Serf is an AP system, and sacrifices consistency for availability.
This means Terraform cannot operate if the central servers cannot form a quorum,
while Serf will continue to function under almost all circumstances.


@@ -1,41 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. SkyDNS"
sidebar_current: "vs-other-skydns"
---
# Terraform vs. SkyDNS
SkyDNS is a relatively new tool designed to solve service discovery.
It uses multiple central servers that are strongly consistent and
fault tolerant. Nodes register services using an HTTP API, and
queries can be made over HTTP or DNS to perform discovery.
Terraform is very similar, but provides a superset of features. Terraform
also relies on multiple central servers to provide strong consistency
and fault tolerance. Nodes can use an HTTP API or use an agent to
register services, and queries are made over HTTP or DNS.
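As a rough illustration of DNS-based discovery, a client can find providers of a service with an ordinary SRV lookup. The service name and domain below are assumptions made up for the example, not documented record names:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// Look up SRV records for a hypothetical "web" service; the domain
	// suffix is purely illustrative.
	_, addrs, err := net.LookupSRV("web", "tcp", "service.example.internal")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, a := range addrs {
		// Each record carries a target host and port for one provider.
		fmt.Printf("%s:%d\n", a.Target, a.Port)
	}
}
```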
However, the systems differ in many ways. Terraform provides a much richer
health checking framework, with support for arbitrary checks and
a highly scalable failure detection scheme. SkyDNS relies on naive
heartbeating and TTLs, which have known scalability issues. Additionally,
the heartbeat only provides a limited liveness check, versus the rich
health checks that Terraform is capable of.
Multiple datacenters can be supported by using "regions" in SkyDNS;
however, the data is managed and queried from a single cluster. If servers
are split between datacenters the replication protocol will suffer from
very long commit times. If all the SkyDNS servers are in a central datacenter, then
connectivity issues can cause entire datacenters to lose availability.
Additionally, even without a connectivity issue, query performance will
suffer as requests must always be performed in a remote datacenter.
Terraform supports multiple datacenters out of the box, and it purposely
scopes the managed data to be per-datacenter. This means each datacenter
runs an independent cluster of servers. Requests are forwarded to remote
datacenters if necessary. This means requests for services within a datacenter
never go over the WAN, and connectivity issues between datacenters do not
affect availability within a datacenter. Additionally, the unavailability
of one datacenter does not affect the service discovery of services
in any other datacenter.


@@ -1,57 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. SmartStack"
sidebar_current: "vs-other-smartstack"
---
# Terraform vs. SmartStack
SmartStack is another tool which tackles the service discovery problem.
It has a rather unique architecture, and has 4 major components: ZooKeeper,
HAProxy, Synapse, and Nerve. The ZooKeeper servers are responsible for storing cluster
state in a consistent and fault tolerant manner. Each node in the SmartStack
cluster then runs both Nerves and Synapses. The Nerve is responsible for running
health checks against a service, and registering with the ZooKeeper servers.
Synapse queries ZooKeeper for service providers and dynamically configures
HAProxy. Finally, clients speak to HAProxy, which does health checking and
load balancing across service providers.
Terraform is a much simpler and more contained system, as it does not rely on any external
components. Terraform uses an integrated [gossip protocol](/docs/internals/gossip.html)
to track all nodes and perform server discovery. This means that server addresses
do not need to be hardcoded and updated fleet wide on changes, unlike SmartStack.
Service registration for both Terraform and Nerves can be done with a configuration file,
but Terraform also supports an API to dynamically change the services and checks that are in use.
For discovery, SmartStack clients must use HAProxy, requiring that Synapse be
configured with all desired endpoints in advance. Terraform clients instead
use the DNS or HTTP APIs without any configuration needed in advance. Terraform
also provides a "tag" abstraction, allowing services to provide metadata such
as versions, primary/secondary designations, or opaque labels that can be used for
filtering. Clients can then request only the service providers which have
matching tags.
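The tag filtering idea can be sketched with a few invented types; this is only an illustration of the concept, not the real API:

```go
package main

import "fmt"

// instance is a hypothetical view of one service provider.
type instance struct {
	Address string
	Tags    []string
}

// withTag returns only the providers that carry the requested tag,
// e.g. "primary" or "v2".
func withTag(all []instance, tag string) []instance {
	var out []instance
	for _, in := range all {
		for _, t := range in.Tags {
			if t == tag {
				out = append(out, in)
				break
			}
		}
	}
	return out
}

func main() {
	providers := []instance{
		{Address: "10.0.0.1:5432", Tags: []string{"primary", "v2"}},
		{Address: "10.0.0.2:5432", Tags: []string{"secondary", "v2"}},
	}
	fmt.Println(withTag(providers, "primary")) // only the primary is returned
}
```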
The systems also differ in how they manage health checking.
Nerve performs local health checks in a manner similar to Terraform agents.
However, Terraform maintains separate catalog and health systems, which allow
operators to see which nodes are in each service pool, as well as providing
insight into failing checks. Nerve simply deregisters nodes on failed checks,
providing limited operator insight. Synapse also configures HAProxy to perform
additional health checks. This causes all potential service clients to check for
liveness. With large fleets, this N-to-N style health checking may be prohibitively
expensive.
Terraform generally provides a much richer health checking system. Terraform supports
Nagios style plugins, enabling a vast catalog of checks to be used. It also
allows for service and host-level checks. There is also a "dead man's switch"
check that allows applications to easily integrate custom health checks. All of this
is also integrated into a Health and Catalog system with APIs enabling operators
to gain insight into the broader system.
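A rough sketch of the "dead man's switch" pattern is below: the application must keep reporting in, and silence is treated as failure. The report URL and interval are hypothetical, chosen only to show the shape of the integration:

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// reportHealthy is a hypothetical call: the application periodically tells
// the local agent "I am still alive". If these reports stop arriving before
// the check's TTL expires, the check is marked as failing on its own.
func reportHealthy() error {
	// The URL below is illustrative only, not a documented endpoint.
	resp, err := http.Get("http://127.0.0.1:8000/checks/my-app/pass")
	if err != nil {
		return err
	}
	resp.Body.Close()
	return nil
}

func main() {
	for range time.Tick(15 * time.Second) {
		if err := reportHealthy(); err != nil {
			log.Println("could not report health:", err)
		}
	}
}
```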
In addition to the service discovery and health checking, Terraform also provides
an integrated key/value store for configuration and multi-datacenter support.
While it may be possible to configure SmartStack for multiple datacenters,
the central ZooKeeper cluster would be a serious impediment to a fault tolerant
deployment.


@@ -1,61 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. ZooKeeper, doozerd, etcd"
sidebar_current: "vs-other-zk"
---
# Terraform vs. ZooKeeper, doozerd, etcd
ZooKeeper, doozerd and etcd are all similar in their architecture.
All three have server nodes that require a quorum of nodes to operate (usually a simple majority).
They are strongly consistent, and expose various primitives that can be used
through client libraries within applications to build complex distributed systems.
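As a quick worked example of the quorum requirement, a simple majority of N servers is floor(N/2) + 1, so a five-server cluster keeps operating with two failed servers but not with three:

```go
package main

import "fmt"

// quorum returns the minimum number of servers that must be reachable
// for a simple-majority cluster of size n to make progress.
func quorum(n int) int { return n/2 + 1 }

func main() {
	for _, n := range []int{3, 5, 7} {
		fmt.Printf("%d servers: quorum %d, tolerates %d failures\n",
			n, quorum(n), n-quorum(n))
	}
}
```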
Terraform works in a similar way within a single datacenter with only server nodes.
In each datacenter, Terraform servers require a quorum to operate
and provide strong consistency. However, Terraform has native support for multiple datacenters,
as well as a more complex gossip system that links server nodes and clients.
If any of these systems are used for pure key/value storage, then they all
roughly provide the same semantics. Reads are strongly consistent, and availability
is sacrificed for consistency in the face of a network partition. However, the differences
become more apparent when these systems are used for advanced cases.
The semantics provided by these systems are attractive for building
service discovery systems. ZooKeeper et al. provide only a primitive K/V store,
and require that application developers build their own system to provide service
discovery. Terraform provides an opinionated framework for service discovery, and
eliminates the guesswork and development effort. Clients simply register services
and then perform discovery using a DNS or HTTP interface. Other systems
require a home-rolled solution.
A compelling service discovery framework must incorporate health checking and the
possibility of failures as well. It is not useful to know that Node A
provides the Foo service if that node has failed or the service crashed. Naive systems
make use of heartbeating, using periodic updates and TTLs. These schemes require work linear
in the number of nodes and place the demand on a fixed number of servers. Additionally, the
failure detection window is at least as long as the TTL. ZooKeeper provides ephemeral
nodes which are K/V entries that are removed when a client disconnects. These are more
sophisticated than a heartbeat system, but also have inherent scalability issues and add
client side complexity. All clients must maintain active connections to the ZooKeeper servers,
and perform keep-alives. Additionally, this requires "thick clients", which are difficult
to write and often result in issues that are difficult to debug.
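For contrast, a naive heartbeat-and-TTL scheme can be sketched as follows (the types are invented for illustration): the servers must track every node's last heartbeat, so the bookkeeping grows linearly with the fleet, and a dead node can go unnoticed for up to a full TTL:

```go
package main

import (
	"fmt"
	"time"
)

// naive TTL bookkeeping on a central server: one entry per node.
var lastSeen = map[string]time.Time{}

// heartbeat is called each time a node checks in; every node must do this
// every interval, so server load grows linearly with the fleet size.
func heartbeat(node string) {
	lastSeen[node] = time.Now()
}

// expire scans all nodes and marks any that missed their TTL as failed.
// A node can be dead for up to a full TTL before this notices.
func expire(ttl time.Duration) []string {
	var failed []string
	for node, seen := range lastSeen {
		if time.Since(seen) > ttl {
			failed = append(failed, node)
		}
	}
	return failed
}

func main() {
	heartbeat("node-a")
	time.Sleep(50 * time.Millisecond)
	fmt.Println(expire(30 * time.Millisecond)) // node-a missed its TTL
}
```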
Terraform uses a very different architecture for health checking. Instead of only
having server nodes, Terraform clients run on every node in the cluster.
These clients are part of a [gossip pool](/docs/internals/gossip.html), which
serves several functions including distributed health checking. The gossip protocol implements
an efficient failure detector that can scale to clusters of any size without concentrating
the work on any select group of servers. The clients also enable a much richer set of health checks to be run locally,
whereas ZooKeeper ephemeral nodes are a very primitive check of liveness. Clients can check that
a web server is returning 200 status codes, that memory utilization is not critical, that there is
sufficient disk space, and so on. The Terraform clients expose a simple HTTP interface and avoid
exposing the complexity of the system to clients in the way that ZooKeeper does.
Terraform provides first class support for service discovery, health checking,
K/V storage, and multiple datacenters. To support anything more than simple K/V storage,
all these other systems require additional tools and libraries to be built on
top. By using client nodes, Terraform provides a simple API that only requires thin clients.
Additionally, the API can be avoided entirely by using configuration files and the
DNS interface to have a complete service discovery solution with no development at all.