
diff --git a/src/docs/user/cluster/cluster.diviner b/src/docs/user/cluster/cluster.diviner
index 93800c81bb..5cb1a2671e 100644
--- a/src/docs/user/cluster/cluster.diviner
+++ b/src/docs/user/cluster/cluster.diviner
@@ -1,255 +1,251 @@
@title Clustering Introduction
@group cluster
Guide to configuring Phabricator across multiple hosts for availability and
performance.
Overview
========
WARNING: This feature is a very early prototype; the features this document
describes are mostly speculative fantasy.
-Phabricator can be configured to run on mulitple hosts with redundant services
+Phabricator can be configured to run on multiple hosts with redundant services
to improve its availability and scalability, and make disaster recovery much
easier.
Clustering is more complex to set up and maintain than running everything on a
single host, but greatly reduces the cost of recovering from hardware and
network failures.
Each Phabricator service has an array of clustering options that can be
configured independently. Configuring a cluster is inherently complex, and this
is an advanced feature aimed at installs with large userbases and experienced
operations personnel who need this high degree of flexibility.
The remainder of this document summarizes how to add redundancy to each
service and where your efforts are likely to have the greatest impact.
For additional guidance on setting up a cluster, see "Overlaying Services"
and "Cluster Recipes" at the bottom of this document.
Preparing for Clustering
========================
To begin deploying Phabricator in cluster mode, set up `cluster.addresses`
in your configuration.
-This option should contain a list of network addess blocks which are considered
+This option should contain a list of network address blocks which are considered
to be part of the cluster. Hosts in this list are allowed to bend (or even
break) some of the security and policy rules when they make requests to other
hosts in the cluster, so this list should be as small as possible. See "Cluster
Whitelist Security" below for discussion.
If you are deploying hardware in EC2, a reasonable approach is to launch a
dedicated Phabricator VPC, whitelist the whole VPC as a Phabricator cluster,
and then deploy only Phabricator services into that VPC.
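For example, if your cluster hosts all live inside a private `10.0.0.0/16`
block (the address is a placeholder; substitute your own network), you might
configure:
```
./bin/config set cluster.addresses '["10.0.0.0/16"]'
```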
If you have additional auxiliary hosts which run builds and tests via Drydock,
you should //not// include them in the cluster address definition. For more
detailed discussion of the Drydock security model, see @{Drydock User Guide:
Security}.
Most other clustering features will not work until you define a cluster by
configuring `cluster.addresses`.
Cluster Whitelist Security
==========================
When you configure `cluster.addresses`, you should keep the list of trusted
cluster hosts as small as possible. Hosts on this list gain additional
capabilities, including these:
**Trusted HTTP Headers**: Normally, Phabricator distrusts the load balancer
HTTP headers `X-Forwarded-For` and `X-Forwarded-Proto` because they may be
client-controlled and can be set to arbitrary values by an attacker if no load
balancer is deployed. In particular, clients can set `X-Forwarded-For` to any
value and spoof traffic from arbitrary remotes.
These headers are trusted when they are received from a host on the cluster
address whitelist. This allows requests from cluster load balancers to be
interpreted correctly by default without requiring additional custom code or
configuration.
**Intracluster HTTP**: Requests from cluster hosts are not required to use
HTTPS, even if `security.require-https` is enabled, because it is common to
terminate HTTPS on load balancers and use plain HTTP for requests within a
cluster.
**Special Authentication Mechanisms**: Cluster hosts are allowed to connect to
other cluster hosts with "root credentials", and to impersonate any user
account.
The use of root credentials is required because the daemons must be able to
bypass policies in order to function properly: they need to send mail about
private conversations and import commits in private repositories.
The ability to impersonate users is required because SSH nodes must receive,
interpret, modify, and forward SSH traffic. They can not use the original
credentials to do this because SSH authentication is asymmetric and they do not
have the user's private key. Instead, they use root credentials and impersonate
the user within the cluster.
These mechanisms are still authenticated (and use asymmetric keys, like SSH
does), so access to a host in the cluster address block does not mean that an
-attacker can immediately compromise the cluster. However, an overbroad cluster
+attacker can immediately compromise the cluster. However, an over-broad cluster
address whitelist may give an attacker who gains some access additional tools
to escalate access.
Note that if an attacker gains access to an actual cluster host, these extra
powers are largely moot. Most cluster hosts must be able to connect to the
master database to function properly, so the attacker will just do that and
freely read or modify whatever data they want.
Cluster: Databases
==================
Configuring multiple database hosts is moderately complex, but normally has the
highest impact on availability and resistance to data loss. This is usually the
most important service to make redundant if your focus is on availability and
disaster recovery.
Configuring replicas allows Phabricator to run in read-only mode if you lose
the master and to quickly promote the replica as a replacement.
For details, see @{article:Cluster: Databases}.
Cluster: Repositories
=====================
Configuring multiple repository hosts is complex, but is required before you
can add multiple daemon or web hosts.
Repository replicas are important for availability if you host repositories
on Phabricator, but less important if you host repositories elsewhere
(instead, you should focus on making that service more available).
The distributed nature of Git and Mercurial tends to mean that they are
naturally somewhat resistant to data loss: every clone of a repository includes
the entire history.
For details, see @{article:Cluster: Repositories}.
Cluster: Daemons
================
Configuring multiple daemon hosts is straightforward, but you must configure
repositories first.
With daemons running on multiple hosts, you can transparently survive the loss
of any subset of hosts without an interruption to daemon services, as long as
at least one host remains alive. Daemons are stateless, so spreading daemons
across multiple hosts provides no resistance to data loss.
For details, see @{article:Cluster: Daemons}.
Cluster: Web Servers
====================
Configuring multiple web hosts is straightforward, but you must configure
repositories first.
With multiple web hosts, you can transparently survive the loss of any subset
of hosts as long as at least one host remains alive. Web hosts are stateless,
so putting multiple hosts in service provides no resistance to data loss.
For details, see @{article:Cluster: Web Servers}.
Overlaying Services
===================
Although hosts can run a single dedicated service type, certain groups of
services work well together. Phabricator clusters usually do not need to be
very large, so deploying a small number of hosts with multiple services is a
good place to start.
In planning a cluster, consider these blended host types:
**Everything**: Run HTTP, SSH, MySQL, repositories and daemons on a single
host. This is the starting point for single-node setups, and usually also the
best configuration when adding the second node.
**Everything Except Databases**: Run HTTP, SSH, repositories and daemons on one
host, and MySQL on a different host. MySQL uses many of the same resources that
other services use. It's also simpler to separate than other services, and
tends to benefit the most from dedicated hardware.
-**Just Databases**: Separating MySQL onto dedicated nodes
-
-Database nodes tend to benefit the most from
-
**Repositories and Daemons**: Run repositories and daemons on the same host.
Repository hosts //must// run daemons, and it normally makes sense to
completely overlay repositories and daemons. These services tend to use
different resources (repositories are heavier on I/O and lighter on CPU/RAM;
daemons are heavier on CPU/RAM and lighter on I/O).
Repositories and daemons are also both less latency sensitive than other
-service types, so there's a wider margin of error for underprovisioning them
-before performance is noticably affected.
+service types, so there's a wider margin of error for under-provisioning them
+before performance is noticeably affected.
These nodes tend to use system resources in a balanced way. Individual nodes
in this class do not need to be particularly powerful.
**Frontend Servers**: Run HTTP and SSH on the same host. These are easy to set
up, stateless, and you can scale the pool up or down easily to meet demand.
Routing both types of ingress traffic through the same initial tier can
simplify load balancing.
These nodes tend to need relatively little RAM.
Cluster Recipes
===============
This section provides some guidance on reasonable ways to scale up a cluster.
The smallest possible cluster is **two hosts**. Run everything (web, ssh,
database, repositories, and daemons) on each host. One host will serve as the
master; the other will serve as a replica.
Ideally, you should physically separate these hosts to reduce the chance that a
natural disaster or infrastructure disruption could disable or destroy both
hosts at the same time.
From here, you can choose how you expand the cluster.
To improve **scalability and performance**, separate loaded services onto
dedicated hosts and then add more hosts of that type to increase capacity. If
you have a two-node cluster, the best way to improve scalability by adding one
host is likely to separate the master database onto its own host.
Note that increasing scale may //decrease// availability by leaving you with
too little capacity after a failure. If you have three hosts handling traffic
and one datacenter fails, too much traffic may be sent to the single remaining
host in the surviving datacenter. You can hedge against this by mirroring new
hosts in other datacenters (for example, also separate the replica database
onto its own host).
After separating databases, separating repository + daemon nodes is likely
the next step.
To improve **availability**, add another copy of everything you run in one
datacenter to a new datacenter. For example, if you have a two-node cluster,
the best way to improve availability is to run everything on a third host in a
third datacenter. If you have a 6-node cluster with a web node, a database node,
and a repo + daemon node in two datacenters, add 3 more nodes to create a copy
of each node in a third datacenter.
You can continue adding hosts until you run out of hosts.
Next Steps
==========
Continue by:
- learning how Phacility configures and operates a large, multi-tenant
production cluster in ((cluster)).
diff --git a/src/docs/user/cluster/cluster_daemons.diviner b/src/docs/user/cluster/cluster_daemons.diviner
index 19e7e37f6d..f6aa1cbe74 100644
--- a/src/docs/user/cluster/cluster_daemons.diviner
+++ b/src/docs/user/cluster/cluster_daemons.diviner
@@ -1,59 +1,59 @@
@title Cluster: Daemons
@group intro
Configuring Phabricator to use multiple daemon hosts.
Overview
========
WARNING: This feature is a very early prototype; the features this document
describes are mostly speculative fantasy.
You can run daemons on multiple hosts. The advantages of doing this are:
- you can completely survive the loss of multiple daemon hosts; and
- worker queue throughput may improve.
This configuration is simple, but you must configure repositories first. For
details, see @{article:Cluster: Repositories}.
Since repository hosts must run daemons anyway, you usually do not need to do
any additional work and can skip this entirely.
Adding Daemon Hosts
===================
After configuring repositories for clustering, launch daemons on every
repository host according to the documentation in
@{article:Cluster: Repositories}. These daemons are necessary: repositories
will not fetch, update, or synchronize properly without them.
-If your repository clustering is redundant (you have at least two repsoitory
+If your repository clustering is redundant (you have at least two repository
hosts), these daemons are also likely to be sufficient in most cases. If you
want to launch additional hosts anyway (for example, to increase queue capacity
for unusual workloads), see "Dedicated Daemon Hosts" below.
Dedicated Daemon Hosts
======================
You can launch additional daemon hosts without any special configuration.
Daemon hosts must be able to reach other hosts on the network, but do not need
to run any services (like HTTP or SSH). Simply deploy the Phabricator software
and configuration and start the daemons.
Normally, there is little reason to deploy dedicated daemon hosts. They can
improve queue capacity, but generally do not improve availability or increase
resistance to data loss on their own. Instead, consider deploying more
repository hosts: repository hosts run daemons, so this will increase queue
capacity but also improve repository availability and cluster resistance.
Next Steps
==========
Continue by:
- returning to @{article:Clustering Introduction}; or
- configuring repositories first with @{article:Cluster: Repositories}.
diff --git a/src/docs/user/cluster/cluster_databases.diviner b/src/docs/user/cluster/cluster_databases.diviner
index 03c9619c97..234db4d86a 100644
--- a/src/docs/user/cluster/cluster_databases.diviner
+++ b/src/docs/user/cluster/cluster_databases.diviner
@@ -1,322 +1,322 @@
@title Cluster: Databases
@group intro
Configuring Phabricator to use multiple database hosts.
Overview
========
WARNING: This feature is a very early prototype; the features this document
describes are mostly speculative fantasy.
You can deploy Phabricator with multiple database hosts, configured as a master
and a set of replicas. The advantages of doing this are:
- faster recovery from disasters by promoting a replica;
- graceful degradation if the master fails;
- reduced load on the master; and
- some tools to help monitor and manage replica health.
This configuration is complex, and many installs do not need to pursue it.
Phabricator can not currently be configured into a multi-master mode, nor can
it be configured to automatically promote a replica to become the new master.
If you lose the master, Phabricator can degrade automatically into read-only
mode and remain available, but can not fully recover without operational
intervention unless the master recovers on its own.
Setting up MySQL Replication
============================
TODO: Write this section.
Configuring Replicas
====================
Once your replicas are in working order, tell Phabricator about them by
configuring the `cluster.databases` option. This option must be configured from
the command line or in configuration files because Phabricator needs to read
it //before// it can connect to databases.
This option value will list all of the database hosts that you want Phabricator
to interact with: your master and all your replicas. Each entry in the list
should have these keys:
- `host`: //Required string.// The database host name.
- `role`: //Required string.// The cluster role of this host, one of
`master` or `replica`.
- `port`: //Optional int.// The port to connect to. If omitted, the default
port from `mysql.port` will be used.
- `user`: //Optional string.// The MySQL username to use to connect to this
host. If omitted, the default from `mysql.user` will be used.
- `pass`: //Optional string.// The password to use to connect to this host.
If omitted, the default from `mysql.pass` will be used.
- `disabled`: //Optional bool.// If set to `true`, Phabricator will not
connect to this host. You can use this to temporarily take a host out
of service.
When `cluster.databases` is configured, the `mysql.host` option is not used.
The other MySQL connection configuration options (`mysql.port`, `mysql.user`,
`mysql.pass`) are used only to provide defaults.
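For example, a minimal master-plus-replica configuration (the hostnames are
placeholders) might be set from the command line like this:
```
./bin/config set cluster.databases '[
  {
    "host": "db001.example.com",
    "role": "master"
  },
  {
    "host": "db002.example.com",
    "role": "replica"
  }
]'
```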
Once you've configured this option, restart Phabricator for the changes to take
effect, then continue to "Monitoring Replicas" to verify the configuration.
Monitoring Replicas
===================
You can monitor replicas in {nav Config > Cluster Databases}. This interface
shows you a quick overview of replicas and their health, and can detect some
common issues with replication.
The table on this page shows each database and current status.
NOTE: This page runs its diagnostics //from the web server that is serving the
request//. If you are recovering from a disaster, the view this page shows
may be partial or misleading, and two requests served by different servers may
see different views of the cluster.
**Connection**: Phabricator tries to connect to each configured database, then
shows the result in this column. If it fails, a brief diagnostic message with
details about the error is shown. If it succeeds, the column shows a rough
measurement of latency from the current webserver to the database.
**Replication**: This is a summary of replication status on the database. If
things are properly configured and stable, the replicas should be actively
replicating and no more than a few seconds behind master, and the master
should //not// be replicating from another database.
To report this status, the user Phabricator is connecting as must have the
`REPLICATION CLIENT` privilege (or the `SUPER` privilege) so it can run the
`SHOW SLAVE STATUS` command. The `REPLICATION CLIENT` privilege only enables
the user to run diagnostic commands, so it should be reasonable to grant it in
most cases, but it is not required. If you choose not to grant it, this page
can not show any useful diagnostic information about replication status but
everything else will still work.
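If you do choose to grant it, a standard MySQL statement is sufficient; the
username and host pattern here are placeholders for whatever account
Phabricator connects as:
```
GRANT REPLICATION CLIENT ON *.* TO 'phabricator'@'%';
```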
If a replica is more than a second behind master, this page will show the
current replication delay. If the replication delay is more than 30 seconds,
it will report "Slow Replication" with a warning icon.
If replication is delayed, data is at risk: if you lose the master and can not
later recover it (for example, because a meteor has obliterated the datacenter
housing the physical host), data which did not make it to the replica will be
lost forever.
Beyond the risk of data loss, any read-only traffic sent to the replica will
see an older view of the world which could be confusing for users: it may
appear that their data has been lost, even if it is safe and just hasn't
replicated yet.
Phabricator will attempt to prevent clients from seeing out-of-date views, but
sometimes sending traffic to a delayed replica is the best available option
(for example, if the master can not be reached).
**Health**: This column shows the result of recent health checks against the
server. After several checks in a row fail, Phabricator will mark the server
as unhealthy and stop sending traffic to it until several checks in a row
later succeed.
Note that each web server tracks database health independently, so if you have
several servers they may have different views of database health. This is
normal and not problematic.
For more information on health checks, see "Unreachable Masters" below.
**Messages**: This column has additional details about any errors shown in the
other columns. These messages can help you understand or resolve problems.
Testing Replicas
================
To test that your configuration can survive a disaster, turn off the master
database. Do this with great ceremony, making a cool explosion sound as you
run the `mysqld stop` command.
If things have been set up properly, Phabricator should degrade to a temporary
read-only mode immediately. After a brief period of unresponsiveness, it will
degrade further into a longer-term read-only mode. For details on how this
-works interanlly, see "Unreachable Masters" below.
+works internally, see "Unreachable Masters" below.
Once satisfied, turn the master back on. After a brief delay, Phabricator
should recognize that the master is healthy again and recover fully.
Throughout this process, the {nav Cluster Databases} console will show a
current view of the world from the perspective of the web server handling the
request. You can use it to monitor state.
You can perform a narrower test by enabling `cluster.read-only` in
configuration. This will put Phabricator into read-only mode immediately
without turning off any databases.
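For example:
```
./bin/config set cluster.read-only true
```
Set the option back to `false` to leave read-only mode when you are done
testing.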
You can use this mode to understand which capabilities will and will not be
available in read-only mode, and make sure any information you want to remain
accessible in a disaster (like wiki pages or contact information) is really
accessible.
See the next section, "Degradation to Read Only Mode", for more details about
when, why, and how Phabricator degrades.
If you run custom code or extensions, they may not accommodate read-only mode
properly. You should specifically test that they function correctly in
read-only mode and do not prevent you from accessing important information.
Degradation to Read-Only Mode
=============================
Phabricator will degrade to read-only mode when any of these conditions occur:
- you turn it on explicitly;
- you configure cluster mode, but don't set up any masters;
- the master can not be reached while handling a request; or
- recent attempts to connect to the master have consistently failed.
When Phabricator is running in read-only mode, users can still read data and
browse and clone repositories, but they can not edit, update, or push new
changes. For example, users can still read disaster recovery information on
the wiki or emergency contact information on user profiles.
You can enable this mode explicitly by configuring `cluster.read-only`. Some
reasons you might want to do this include:
- to test that the mode works like you expect it to;
- to make sure that information you need will be available;
- to prevent new writes while performing database maintenance; or
- to permanently archive a Phabricator install.
You can also enable this mode implicitly by configuring `cluster.databases`
but disabling the master, or by not specifying any host as a master. This may
be more convenient than turning it on explicitly during the course of
operations work.
If Phabricator is unable to reach the master database, it will degrade into
read-only mode automatically. See "Unreachable Masters" below for details on
how this process works.
If you end up in a situation where you have lost the master and can not get it
back online (or can not restore it quickly) you can promote a replica to become
the new master. See the next section, "Promoting a Replica", for details.
Promoting a Replica
===================
TODO: Write this section.
Unreachable Masters
===================
This section describes how Phabricator determines that a master has been lost,
marks it unreachable, and degrades into read-only mode.
Phabricator degrades into read-only mode automatically in two ways: very
briefly in response to a single connection failure, or more permanently in
response to a series of connection failures.
In the first case, if a request needs to connect to the master but is not able
to, Phabricator will temporarily degrade into read-only mode for the remainder
of that request. The alternative is to fail abruptly, but Phabricator can
sometimes degrade successfully and still respond to the user's request, so it
makes an effort to finish serving the request from replicas.
If the request was a write (like posting a comment) it will fail anyway, but
if it was a read that did not actually need to use the master it may succeed.
This temporary mode is intended to recover as gracefully as possible from brief
interruptions in service (a few seconds), like a server being restarted, a
network link becoming temporarily unavailable, or brief periods of load-related
disruption. If the anomaly is temporary, Phabricator should recover immediately
(on the next request once service is restored).
This mode can be slow for users (they need to wait on connection attempts to
the master which fail) and does not reduce load on the master (requests still
attempt to connect to it).
The second way Phabricator degrades is by running periodic health checks
against databases, and marking them unhealthy if they fail over a longer period
of time. This mechanism is very similar to the health checks that most HTTP
load balancers perform against web servers.
If a database fails several health checks in a row, Phabricator will mark it as
unhealthy and stop sending all traffic (except for more health checks) to it.
This improves performance during a service interruption and reduces load on the
master, which may help it recover from load problems.
You can monitor the status of health checks in the {nav Cluster Databases}
console. The "Health" column shows how many checks have run recently and
how many have succeeded.
Health checks run every 3 seconds, and 5 checks in a row must fail or succeed
before Phabricator marks the database as healthy or unhealthy, so it will
generally take about 15 seconds for a database to change state after it goes
down or comes up.
If all of the recent checks fail, Phabricator will mark the database as
unhealthy and stop sending traffic to it. If the master was the database that
was marked as unhealthy, Phabricator will actively degrade into read-only mode
until it recovers.
This mode only attempts to connect to the unhealthy database once every few
seconds to see if it is recovering, so performance will be better on average
(users rarely need to wait for bad connections to fail or time out) and the
-datbase will receive less load.
+database will receive less load.
Once all of the recent checks succeed, Phabricator will mark the database as
healthy again and continue sending traffic to it.
Health checks are tracked individually for each web server, so some web servers
may see a host as healthy while others see it as unhealthy. This is normal, and
can accurately reflect the state of the world: for example, the link between
datacenters may have been lost, so hosts in one datacenter can no longer see
the master, while hosts in the other datacenter still have a healthy link to
it.
Backups
=======
Even if you configure replication, you should still retain separate backup
snapshots. Replicas protect you from data loss if you lose a host, but they do
not let you recover from data mutation mistakes.
If something issues `DELETE` or `UPDATE` statements and destroys data on the
master, the mutation will propagate to the replicas almost immediately and the
data will be gone forever. Normally, the only way to recover this data is from
backup snapshots.
Although you should still have a backup process, your backup process can
safely pull dumps from a replica instead of the master. This operation can
-be slow, so offloading it to a replica can make the perforance of the master
+be slow, so offloading it to a replica can make the performance of the master
more consistent.
To dump from a replica, wait for this TODO to be resolved and then do whatever
it says to do:
TODO: Make `bin/storage dump` replica-aware. See T10758.
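Until then, a plain `mysqldump` pointed at a replica can serve as a stopgap;
the hostname and username below are placeholders:
```
mysqldump --host=db002.example.com --user=backup --password \
  --single-transaction --all-databases | gzip > snapshot.sql.gz
```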
With recent versions of MySQL, it is also possible to configure a //delayed//
replica which intentionally lags behind the master (say, by 12 hours). In the
event of a bad mutation, this could give you a larger window of time to
recognize the issue and recover the lost data from the delayed replica (which
might be quick) without needing to restore backups (which might be very slow).
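On MySQL 5.6 or newer, the delay is configured on the replica itself; a
minimal sketch (the 12-hour value is just an example):
```
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 43200;
START SLAVE;
```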
Delayed replication is outside the scope of this document, but may be worth
considering as an additional data security step on top of backup snapshots
depending on your resources and needs. If you configure a delayed replica, do
not add it to the `cluster.databases` configuration: Phabricator should never
send traffic to it, and does not need to know about it.
Next Steps
==========
Continue by:
- returning to @{article:Clustering Introduction}.
diff --git a/src/docs/user/cluster/cluster_repositories.diviner b/src/docs/user/cluster/cluster_repositories.diviner
index c2750fbd78..c5179666a7 100644
--- a/src/docs/user/cluster/cluster_repositories.diviner
+++ b/src/docs/user/cluster/cluster_repositories.diviner
@@ -1,112 +1,112 @@
@title Cluster: Repositories
@group intro
Configuring Phabricator to use multiple repository hosts.
Overview
========
WARNING: This feature is a very early prototype; the features this document
describes are mostly speculative fantasy.
If you use Git or Mercurial, you can deploy Phabricator with multiple
repository hosts, configured so that each host is readable and writable. The
advantages of doing this are:
- you can completely survive the loss of repository hosts;
- reads and writes can scale across multiple machines; and
- read and write performance across multiple geographic regions may improve.
This configuration is complex, and many installs do not need to pursue it.
This configuration is not currently supported with Subversion.
Repository Hosts
================
Repository hosts must run a complete, fully configured copy of Phabricator,
including a webserver. If you make repositories available over SSH, they must
also run a properly configured `sshd`.
Generally, these hosts will run the same set of services and configuration that
web hosts run. If you prefer, you can overlay these services and put web and
repository services on the same hosts.
When a user requests information about a repository that can only be satisfied
-by examining a repository working copy, the webserver receiving the reqeust
+by examining a repository working copy, the webserver receiving the request
will make an HTTP service call to a repository server which hosts the
repository to retrieve the data it needs. It will use the result of this query
to respond to the user.
How Reads and Writes Work
=========================
Phabricator repository replicas are multi-master: every node is readable and
writable, and a cluster of nodes can (almost always) survive the loss of any
arbitrary subset of nodes so long as at least one node is still alive.
Phabricator maintains an internal version for each repository, and increments
it when the repository is mutated.
Before responding to a read, replicas make sure their version of the repository
is up to date (no node in the cluster has a newer version of the repository).
If it isn't, they block the read until they can complete a fetch.
Before responding to a write, replicas obtain a global lock, perform the same
version check and fetch if necessary, then allow the write to continue.
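As a toy model of the scheme the paragraphs above describe, consider the
following sketch; it is illustrative Python, not Phabricator's actual
implementation, and every name in it is hypothetical:
```
from dataclasses import dataclass, field

@dataclass
class Node:
    # Maps repository name to the version this node has fetched.
    versions: dict = field(default_factory=dict)

    def fetch_from(self, source, repo):
        # Catch this node up to the source node's version of the repository.
        self.versions[repo] = source.versions.get(repo, 0)

def serve_read(local, cluster, repo):
    # Find the node holding the newest version of the repository.
    leader = max(cluster, key=lambda n: n.versions.get(repo, 0))
    # If this node is behind, block the read until a fetch catches it up.
    if local.versions.get(repo, 0) < leader.versions.get(repo, 0):
        local.fetch_from(leader, repo)
    # The local working copy is now current and safe to read.
    return local.versions.get(repo, 0)
```
A write follows the same version check, but first takes the global lock so
that only one node at a time can mutate and increment the repository version.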
HTTP vs HTTPS
=============
Intracluster requests (from the daemons to repository servers, or from
webservers to repository servers) are permitted to use HTTP, even if you have
set `security.require-https` in your configuration.
It is common to terminate SSL at a load balancer and use plain HTTP beyond
that, and the `security.require-https` feature is primarily focused on making
client browser behavior more convenient for users, so it does not apply to
intracluster traffic.
Using HTTP within the cluster leaves you vulnerable to attackers who can
observe traffic within a datacenter, or observe traffic between datacenters.
This is normally very difficult, but within reach for state-level adversaries
like the NSA.
If you are concerned about these attackers, you can terminate HTTPS on
repository hosts and bind to them with the "https" protocol. Just be aware that
the `security.require-https` setting won't prevent you from making
configuration mistakes, as it doesn't cover intracluster traffic.
Other mitigations are possible, but securing a network against the NSA and
similar agents of other rogue nations is beyond the scope of this document.
Backups
=======
Even if you configure clustering, you should still consider retaining separate
backup snapshots. Replicas protect you from data loss if you lose a host, but
they do not let you rewind time to recover from data mutation mistakes.
If something issues a `--force` push that destroys branch heads, the mutation
will propagate to the replicas.
You may be able to manually restore the branches by using tools like the
Phabricator push log or the Git reflog, so it is less important to retain
repository snapshots than database snapshots, but it is still possible for
data to be lost permanently, especially if you don't notice the problem for
some time.
Retaining separate backup snapshots will improve your ability to recover more
data more easily in a wider range of disaster situations.
Next Steps
==========
Continue by:
- returning to @{article:Clustering Introduction}.
