The Art of Achieving the Impossible

[Image: the Exasol Mini Cluster]

I am so pleased to announce that the Atheon Exasol Mini Cluster is now up and running!

Also known as the “Chinese Suitcase” and the “Atheon Holdall”, the Mini Cluster has been a few weeks in development, whilst I battled with a handful of minor issues.

The kit list for the Mini Cluster (in a 2+1 architecture) is:

The specs of the mini PC are:

  • Intel Core i7 (mobile processor)
  • 16GB RAM capacity
  • 2 gigabit NICs

The key things to note are:

  • 2 drives are required for each data node (only one is needed for the Licence node)
  • 2 network cards are required for every node
  • An Exasol licence is required to operate the nodes together (talk to your Exasol Account manager, or get in touch and I’ll direct you, although this IS NOT SUPPORTED)

With cloud costs looking to be a recurring £1600 per month, the total cost for the whole setup was around £2300, which will prove an invaluable saving for a small company. This cluster provides a great low-cost development environment, could be used for a PoC, or could serve as a “portable” (like TVs in the 90s) Exasol implementation. More portable than a server rack in a flight case, anyway!
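
As a quick sanity check, here’s the back-of-an-envelope arithmetic as a little Python sketch, using only the figures quoted above (purely illustrative; actual cloud pricing will obviously vary):

# Rough break-even arithmetic using the figures from this post.
cloud_monthly_gbp = 1600    # recurring cloud estimate, per month
mini_cluster_gbp = 2300     # one-off cost of the whole mini cluster

months_to_break_even = mini_cluster_gbp / cloud_monthly_gbp
print(f"Pays for itself in roughly {months_to_break_even:.1f} months")
# -> Pays for itself in roughly 1.4 months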

Next up is to get a mini rack machined out of aluminium for it; I think people might start to miss their Lego before too long!

I’d not have been able to build this without the help of the Exasol support guys, who were very patient with my persistence, so big thanks to them.

“The difference between the impossible and the possible lies in a man’s determination.” – Tommy Lasorda

 


2+1 Architecture in Exasol (with a minimum of Redundancy 2)

This blog post follows on from 2 others, which I’d recommend reading before this one:

Data Redundancy 1 in Exasol

Data Redundancy 2 in Exasol

In a 2+1 architecture we have the same setup as before (2 Data nodes), plus an additional node. So we have Node 11 and Node 12, which are our “Active” data nodes as before. We then add another node, Node 13, which is the “Spare” or “Reserved” node. We will also depend on Redundancy 2 here.

[Diagram: redundancy in a 2+1 architecture]

With a ‘+1’ architecture, if a node fails, it is automatically replaced by the Spare node. The slave replica segment of the failed node’s data is pulled from the failed node’s neighbour and restored onto the Spare node.

For example, say Node 12 fails. Its master data is gone, obliterated. The database is down! No problem. Node 13 is pushed into Node 12’s place. Slave segment 2 is pulled from Node 11 (great that we’ve got an exact copy, huh!).

The database restarts, then the segment is restored onto Node 13 as the Master segment, with the end user knowing no different (apart from a hopefully brief outage).

The process of recovery in a ‘+1’ architecture is as follows:

1 – Database discovers failure

2 – Database shuts down

3 – The failed node is automatically replaced with the reserved node

4 – Database starts

5 – Data is restored from the slave segment on the neighbour node onto the Spare node as the Master segment
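
To make the failover a bit more concrete, here’s a tiny Python sketch of the idea. The node objects, segment labels and fail_over function are all made up for illustration; this is not how Exasol is actually implemented under the hood.

# A minimal, purely illustrative model of the '+1' failover described above.
class Node:
    def __init__(self, name):
        self.name = name
        self.master = None   # master segment served by this node
        self.slave = None    # slave copy of a neighbour's master segment

# 2+1: two active data nodes plus one spare/reserved node
n11, n12, n13 = Node("Node 11"), Node("Node 12"), Node("Node 13")
n11.master, n11.slave = "segment 1", "segment 2"   # slave copy of Node 12's master
n12.master, n12.slave = "segment 2", "segment 1"   # slave copy of Node 11's master
active = [n11, n12]

def fail_over(failed, neighbour, spare):
    """Swap the spare node in for the failed one and restore the lost master
    segment from the slave copy held by the neighbour."""
    spare.master = neighbour.slave           # restore master from the slave copy
    active[active.index(failed)] = spare     # spare takes the failed node's place
    print(f"{failed.name} failed; {spare.name} now serves {spare.master} "
          f"(restored from {neighbour.name}'s slave copy)")

fail_over(failed=n12, neighbour=n11, spare=n13)
# -> Node 12 failed; Node 13 now serves segment 2 (restored from Node 11's slave copy)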

The only record that the failover has happened is in the log; notifications of this occurring are flagged at Warning and Critical severity levels.


Data Redundancy 2 in Exasol

Before reading this post, it might add a bit of context to read the first post in this series, about Redundancy 1 and a 2+0 architecture.

Redundancy 2

Using the 2+0 architecture from the previous article, we add a “Redundancy” or “Slave” segment to each node. Node 11 will have Slave segment 2, and Node 12 will have Slave segment 1. The slave segments need to be the same size as the Master segments; so in this case, also 4GB. This takes the total Volume size to 8GB, requiring a total disk space of 16GB.
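
If you prefer the sizing as arithmetic, here’s a quick Python sketch of the numbers above (the variable names are just for illustration):

# Back-of-the-envelope sizing for the 2-node Redundancy 2 example above.
nodes = 2
master_gb = 4                  # each node's Master segment
slave_gb = master_gb           # each Slave segment must match its Master's size

master_data_gb = nodes * master_gb                 # 8GB of master data in total
disk_needed_gb = nodes * (master_gb + slave_gb)    # 16GB of physical disk in total
print(master_data_gb, disk_needed_gb)              # -> 8 16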

Back to our slave segments. The slave on Node 11 is an exact copy of the Master segment 2 on Node 12. Likewise, the slave segment on Node 12 is the exact copy of Master segment 1 on Node 11. This is managed by data being committed to both the master and corresponding slave segment simultaneously. A commit is only valid if both node segments accept it.
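
A toy way to picture that commit rule in Python (the commit function and its arguments are invented for this sketch; they are not Exasol’s actual API):

def commit(master_accepts: bool, slave_accepts: bool) -> bool:
    """A commit only counts if both the master segment and its slave copy accept it."""
    return master_accepts and slave_accepts

print(commit(master_accepts=True, slave_accepts=True))    # True  - commit is valid
print(commit(master_accepts=True, slave_accepts=False))   # False - commit is rejected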

Redundancy 2 ensures that there is always a live copy of the data in the event of a failure, so that the system suffers as little downtime as possible (data availability), and it also ensures data integrity.

With Redundancy enabled, the nodes generate more network traffic. To ensure the best possible performance, it is recommended to have one dedicated network interface for storage (data to disk) and one dedicated network interface for database traffic (querying).

In a ‘+0’ approach, the database is down if one node fails. No recovery is available until the node is fixed. Not good in a production environment.

In the next post we’ll look at a 2+1 architecture, to get the database back up and running pronto!

 


Data Redundancy 1 in Exasol

Before we get to Data Redundancy, it’d be useful to understand the architecture options available to us. Let’s start with ‘2+0’.

2+0

In 2+0, we have the licence node (as always), and then 2 data nodes in the cluster (hence the 2). There is no failover node (hence the 0). As you can see below…

[Diagram: a 2+0 architecture with no redundancy]

So what is data redundancy? Data redundancy is essentially storing multiple “copies” of the data on other nodes in case the primary keeper of that data fails.

Each cluster node holds at least one Master Segment, where the data is stored. In our example, as above, we are running a 2+0 architecture, which consists of Node 11 and Node 12. Both are “Active” nodes.

Redundancy 1

Node 11 has Master Segment 11 and Node 12 has Master Segment 12. Both segments are sized at 4GB apiece. In this instance we are running a Redundancy 1 approach. Redundancy 1 means that each node must remain online for the data to be accessible; there are no Slave Segments or copies. If, for whatever reason, Node 11 became inoperable, the data from Node 11 would be at risk and could not be recovered until that node had been repaired or a backup was available.
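
As a toy illustration, here’s what Redundancy 1 boils down to in a few lines of Python (the node and segment names are just the ones from the example; the function is invented for the sketch):

# With no slave copies, every node holding a master segment must stay online.
segments = {
    "Node 11": "Master Segment 11",
    "Node 12": "Master Segment 12",
}

def data_accessible(online_nodes: set[str]) -> bool:
    """All master segments are reachable only if all of their nodes are online."""
    return all(node in online_nodes for node in segments)

print(data_accessible({"Node 11", "Node 12"}))   # True  - all nodes up, data available
print(data_accessible({"Node 12"}))              # False - Node 11 down, its data is unreachable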

Essentially, in Redundancy 1 there is no database redundancy.

Next time, we’ll focus on a more robust Redundancy 2 (or more) approach.