What delegators need to know about Cardano stake pool infrastructure
The Cardano blockchain has been running on federated nodes since it was created. The nodes are managed by IOHK, the Cardano Foundation and Emurgo. Soon, with the release of the Shelley project, the Cardano blockchain will run decentralized and will no longer be under the management of IOHK but rather under the management of Cardano community stake pool operators.
IOHK will transition from the federated nodes to the decentralized Cardano network in multiple steps. Decentralizing the network is a complex process and IOHK wants to make sure it is done properly. Should any problems arise during the transition, IOHK’s federated nodes will be there as a backup.
As the pool operators gradually take over the job of running the network, many questions arise. One of them is what infrastructure they will use to run the world’s financial operating system, which Cardano aims to be.
A stake pool (a Cardano node) can run on various operating systems. For the incentivized testnet (ITN), the node, called Jormungandr, is available for different operating systems (OS) and processor architectures (see figure below). The best known are Linux, Windows and macOS. We can also include Raspbian, as there are a few pool operators, ourselves included, who run their nodes on a Raspberry Pi 4 or Rock Pi (just for experimental purposes though).
One can buy a physical server or rent one from a provider (host). When you own the physical server you usually have physical access to it, while renting a server often means that you will access and manage it remotely.
Generally physical servers have better performance for a simple reason: there are no additional software layers between the OS and the hardware, as there are with a virtual server. In the figure below, a virtual server is represented by a single APP + OS. There can be many applications running on the OS of course. The figure shows 3 virtual servers but in practice there can be many more; it all depends on the performance of the underlying hardware. The virtual servers do not access the hardware directly. Instead they communicate through virtual hardware, which is another software layer. The hypervisor is the virtualization software that manages the virtual servers, and it runs on the physical hardware.
You can run a virtual server at home on your own desktop computer or on a dedicated server. Another option is to run a virtual private server in the cloud with one of the many cloud service providers like Amazon, Microsoft Azure, Google Cloud, DigitalOcean and many smaller ones.
Running a stake pool
A stake pool operator needs to consider many factors when deciding whether he wants to run a stake pool on his own bare metal server or run it in the cloud. For example:
- power supply
- internet connection
- hardware failure
- upgrades – hardware / software
Environment is something that many people don’t think about, but it can play an important role. If a pool operator runs his node on a hosted bare metal server then he probably does not need to worry about it, as it’s usually taken care of by the hosting provider.
However, when the server is run at home, a stake pool operator needs to ensure that the server is not put in a small closed space. Servers can produce quite a lot of heat and there should be sufficient flow of cool air to prevent them from overheating. Nowadays processors can adjust their working frequency and slow down to prevent overheating, but in extreme cases the whole system can shut down automatically to prevent hardware damage.
For the server to run without interruption, a stake pool operator needs to ensure that its power supply is constant. An uninterruptible power supply (UPS) is a must, since you never know when the guys from the electric company will come to “play” with the cables in the neighborhood or when the first storm will cause a power outage. Sometimes it can take them hours to fix the problem, and a UPS won’t help in such cases at all, since its purpose is to bridge short power interruptions of a few minutes or so. With costlier equipment that time can be extended a bit further though. Another option is to have a generator that can produce electricity, but that can be pretty loud.
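To get a feeling for how long a UPS can bridge an outage, here is a minimal back-of-the-envelope sketch. It assumes the battery capacity is known in watt-hours and that the load is constant; the function name and the default inverter efficiency of 0.85 are our own illustrative assumptions, not values from any particular UPS vendor. Real runtime is usually shorter because batteries age and discharge non-linearly.

```python
def ups_runtime_minutes(battery_wh: float, load_watts: float,
                        inverter_efficiency: float = 0.85) -> float:
    """Rough estimate of how long a UPS can carry a constant load.

    battery_wh          -- battery capacity in watt-hours (assumed known)
    load_watts          -- power drawn by the server and network gear
    inverter_efficiency -- fraction of stored energy that reaches the load
                           (0.85 is an illustrative assumption)
    """
    if battery_wh <= 0 or load_watts <= 0:
        raise ValueError("battery_wh and load_watts must be positive")
    if not 0 < inverter_efficiency <= 1:
        raise ValueError("inverter_efficiency must be in (0, 1]")
    usable_wh = battery_wh * inverter_efficiency
    return usable_wh / load_watts * 60.0


if __name__ == "__main__":
    # Hypothetical numbers: a 200 Wh battery feeding a 100 W home server.
    print(f"{ups_runtime_minutes(200, 100):.0f} minutes")
```

Even with generous numbers the result is measured in minutes or a couple of hours, which is why a UPS alone does not cover a long outage.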
As with the power supply, a stake pool operator who runs his server at home also needs a redundant internet connection, for the same reasons. The redundant connection can be just another physical line into his house. That, however, is not enough to cover all cases. A very well known stake pool operator’s server was unreachable from the Cardano network for a few hours because his internet lines were cut by some workers doing whatever they were doing. The slots that his server had been assigned never turned into blocks and the delegators never saw that ADA.
Another problem with the internet connection can be the type of IP address. Usually non-business customers have dynamic IPs. That means that the IP of the router you have at home changes. How often it changes depends on the service provider, but it is not static, and for running a stake pool you need a static IP. So the stake pool operator needs to check that as well.
Hardware failure can happen anytime. Usually it’s a hard disk and sometimes even the server’s power supply. When running a server you need to have redundant hard disks installed and configured as a redundant array of independent disks (RAID). Sometimes 2 or more disks die at the same time, and then even a RAID configuration (depending on what type of RAID is used) may not be enough.
While other server components usually don’t break as often as hard drives, it can still happen and a stake pool operator needs to address those cases too. Generally he has just two options: take a chance or increase his expenses (paying either for the fix or for prevention).
Upgrades are a normal part of any running system, whether we are talking about hardware or software. Most of the time a hardware upgrade happens after a hardware component has failed. A stake pool operator running a bare metal server at home cannot keep all the spare parts in stock. That would be ridiculously expensive. So it can take a while for him to get the spare parts and fix the issue.
Software upgrades are usually more frequent than hardware upgrades, especially if the pool operator wants to keep his pool up to date and install all the security patches provided by the OS vendor. Those can affect the uptime too, but they are needed in any case, no matter what infrastructure is chosen for the stake pool.
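How many simultaneous disk failures an array survives depends on the RAID level. The sketch below encodes the guaranteed (worst-case) tolerance for the most common standard levels. The function name is our own, RAID 10 is assumed to be built from two-way mirror pairs, and nothing here replaces backups.

```python
def guaranteed_disk_failures_tolerated(level: str, disks: int) -> int:
    """Worst-case number of simultaneous disk failures an array survives.

    Covers the common standard levels only. RAID10 is assumed to use
    two-way mirror pairs, so losing both disks of one pair is fatal.
    """
    minimum_disks = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4, "RAID10": 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum_disks[level]:
        raise ValueError(f"{level} needs at least {minimum_disks[level]} disks")
    if level == "RAID10" and disks % 2 != 0:
        raise ValueError("RAID10 with two-way mirrors needs an even number of disks")

    if level == "RAID0":
        return 0            # striping only, no redundancy at all
    if level == "RAID1":
        return disks - 1    # every disk holds a full copy
    if level == "RAID5":
        return 1            # single parity
    if level == "RAID6":
        return 2            # double parity
    return 1                # RAID10: only one failure is guaranteed survivable


if __name__ == "__main__":
    for level, disks in (("RAID0", 2), ("RAID1", 2), ("RAID5", 4), ("RAID6", 4), ("RAID10", 4)):
        print(level, disks, "disks ->", guaranteed_disk_failures_tolerated(level, disks))
```

With RAID 5, for example, a second disk dying before the first one is replaced and rebuilt means the array is lost, which is exactly the case described above.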
Bare metal server
A bare metal server definitely has better performance than a virtual server, but the real question here is whether that performance is really needed to run a Cardano stake pool (node) and whether it really makes a bare metal server superior to a virtual server.
A Cardano node is not CPU intensive at all, except for occasional CPU spikes and the syncing process, which happens when a node is restarted or started for the first time. The fact that it can be run on a Raspberry Pi 4 says it all. Of course, a stake pool operator runs 2 or 3 instances of the node on the same server, which is not possible on a single Raspberry Pi 4, but you can have a cluster of 2 or 3 Raspberry Pi 4 computers, each running its own instance of a Cardano node.
A Cardano node does not use much RAM either. 4 GB is enough to run a single instance of a Cardano node.
There is one more factor in addition to the CPU/RAM/DISK that can have an impact on the performance of a Cardano stake pool and that is latency.
Low latency means that a stake pool can communicate with other stake pools very quickly. This is very important since the blocks need to be minted in specific time slots assigned by the Cardano blockchain. It becomes even more important when 2 stake pools produce a block for the same slot. The one that wins is the one that propagates its block through the network faster. Here things get more complicated. A Cardano node is not connected directly to all other nodes, but all nodes together form one connected network. So while low latency is without doubt one of the most important factors, it still isn’t everything. Your node does not need to have the lowest latency. As long as it’s under 50ms to the rest of the nodes it should be ok. Besides that, a stake pool does not compete for slots on a daily basis.
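If you want to check a node against that 50ms rule of thumb, a minimal sketch could look like the one below. It uses the time of a TCP handshake as a rough stand-in for one network round trip; this is our own approximation, not how the Cardano node measures anything. The function names and the peer names and latency values in the example are hypothetical.

```python
import socket
import time
from typing import Dict, List


def tcp_connect_latency_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time a TCP handshake to host:port, in milliseconds.

    Roughly one network round trip. If host is a name rather than an IP,
    the DNS lookup is included in the measurement.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; we only care about the elapsed time
    return (time.perf_counter() - start) * 1000.0


def peers_over_budget(latencies_ms: Dict[str, float], budget_ms: float = 50.0) -> List[str]:
    """Return the peers whose latency exceeds the budget, sorted by name."""
    return sorted(peer for peer, ms in latencies_ms.items() if ms > budget_ms)


if __name__ == "__main__":
    # Hypothetical measurements; in practice fill this dict by calling
    # tcp_connect_latency_ms() for each peer your node talks to.
    measured = {"peer-a": 12.4, "peer-b": 75.5, "peer-c": 48.0}
    print("Peers above 50 ms:", peers_over_budget(measured))
```

A single measurement says little, so it makes sense to repeat it a few times per peer and look at the typical value rather than the best one.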
When a stake pool operator decides to run his Cardano node on a bare metal server at his own premises, he needs to consider many things that can have a big impact on the node’s uptime, as we described in the previous sections. Ensuring high availability with 99.99% uptime, which is offered by any serious cloud provider today, can be quite a challenge. Even more so if it is just one person operating the node. An uptime of 99.99% means a downtime of less than 53 minutes per year. Imagine that! That’s a really narrow time window in case of a hardware failure or power outage.
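The downtime budget behind an uptime percentage is simple arithmetic, and it helps to see a few targets next to each other. Here is a small sketch, assuming a non-leap year of 365 days (the function name is our own):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime -> {minutes:.2f} minutes of downtime per year")
```

For 99.99% this gives 52.56 minutes per year, while 99.9% already allows almost 9 hours. That difference is roughly one overnight hardware failure.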
Many of the biggest applications in the world (Office 365, Gmail, YouTube, mobile applications and games, etc.) that we use every day run in the cloud with the help of virtualization technology. Virtualization combined with containerization allows for really scalable solutions. If a server needs more resources it can be upgraded seamlessly with a few clicks in a few minutes. Virtual servers run in data centers where professionals take care of all the infrastructure.
Virtualization also makes it possible to move servers easily. With the help of server migration services, some cloud service providers make it possible to move a virtual machine from one provider to another.
As mentioned at the beginning of this post, even IOHK uses the cloud to run their Cardano nodes.
Fans of the bare metal server might argue that having Cardano nodes running in the cloud goes against the idea of decentralization, as the servers are not in the hands of people but under the control of the corporations running the infrastructure. Sometimes you hear them say that if a corporation decides to shut down Cardano nodes it can do so, like Google did with some YouTube content creators (by the way, all those creators are back as far as we know), etc. We think the chances of that happening are lower than the chances of a hardware failure. Pool operators who choose virtual private servers pay the companies for the service. So it really does not make much sense for the providers to shut the servers down. If one provider does that, other providers will get its customers. If all of them do that, then even the bare metal servers won’t be minting blocks anymore, because the same reasoning applies to the internet service providers. Finally, decentralization is not just a question of which company the stake pools are running with but also of how many stake pools a single stake pool operator runs.
We are not saying that running a bare metal server is bad. What we are saying is that bare metal servers are not superior. In the end, there is just one simple question to be asked: Does it work?
And the answer is yes. It does work. Both bare metal servers and virtual servers from different cloud service providers have excellent performance on the ITN. The figure below shows the number of servers per infrastructure type. You can also see that the majority of stake goes to the bare metal servers. However, the number of battles won by those servers is not the highest, even though the hosted bare metal servers have one of the highest percentages of won battles.
The decision to go with a bare metal server or a virtual server has more to do with the economics than anything else. Some pool operators estimate that running a server in the cloud is cheaper while others think otherwise. That’s like with the crypto or stock market where there is always someone who thinks the market will go up and someone who thinks it will go down.
If you are still not convinced, we invite you to listen to DevOps engineer Samuel Leathers from IOHK.
To be honest, the best thing for the Cardano blockchain is to have a diverse infrastructure in different locations around the world. So, dear delegator, if you are looking to get as much as possible from your staked ADA, you should base your decision on factors other than just the infrastructure type. More about that in a future post.
At the end we would like to leave you with a poem from Percy Bysshe Shelley himself. Happy staking!
If you find this post useful consider staking with us. Our ticker is #TILX.