
Changes are afoot at Exasol

(image from Exasol.com)

A notification popped up on my LinkedIn today… a letter from the CEO. Obviously not to me, but it seems to the community in general.

Aaron Auld, Exasol’s CEO, shares with us some of the changes to come at Exasol. Aaron discusses how customers and the community are coming to the forefront of these changes, with customer feedback being incorporated and new content coming for the community.

It’s hard not to notice they’ve had a bit of a website revamp (looking good, good-looking!). I can’t wait for the Jira backend to get the same treatment!

 

What is ExaBucketFS?

So what is ExaBucketFS?

ExaBucketFS is essentially just a file system or store within the cluster. Files are automatically replicated across all nodes.

There are two main use cases (that I know of) for ExaBucketFS. Firstly, it acts as a repository for our in-house and third-party libraries (e.g. Python, Java, R). Secondly, it allows us to store binary data, such as trained statistical models, which the Exasol database cannot traditionally store.

Before version 6, the Exasol cluster required access to the internet to be able to get the libraries used by UDFs. For custom libraries, you would have had to set up your own repository, and again be able to access it over the internet.

ExaBucketFS does require some administration though.

ExaBucketFS has an API that lets you put, get and remove files in the cluster, and you can drive it with curl. In the last post, I discussed how to use curl to upload files to a bucket. The files added to the bucket are then accessible for use within UDFs.

Buckets can be password protected for reading or writing, or left public, although this to me seems like a purely DBA type task – get a password on them!

There are also the following dos and don’ts for ExaBucketFS:

  • Ensure that you don’t write to buckets concurrently – Buckets are non-transactional.
  • Buckets and files do not get backed up – so you need to make sure you have this backed up somewhere else (by your own means!)
  • Don’t use ExaBucketFS as general storage. There is 100% replication across all nodes, meaning that a file you store once is copied to every node, consuming your disk space.

Exasol’s knowledgeable Mathias Brink describes ExaBucketFS in one of the Exasol videos here.

HOW TO: Get your UDFs working with Python libraries after Exasol 6 upgrade

We recently took the plunge and upgraded to Exasol V6. There are a few changes that you can find out about, listed in the upgrade notes here. Like all upgrades, I thoroughly recommend you study these notes in detail, as there can always be something to trip you up.

This happened to us, with the changed implementation of UDFs and the way libraries are stored and used. You can read all about how it used to work here, but for now let’s go through how to make your Python libraries work with your UDFs again! I’ll be splitting up the content of this blog into a couple of future ones too, as there’s quite a lot to go through!

Get Curl!

First off, you’re going to need Curl. Get it here. Extract it on your computer, and remember where you put it!

Setup Bucket Services

Next up, let’s check out the changes in ExaOperation. There’s a new ExaBuckets tab. Head on in there, and click Add.

[Screenshot: AddBucketService]

By default, the HTTP port for the service bucket is 2580. If you create another service bucket after this, you’ll need to change the port number (which I need to here, so I’m using 2585). Add a Description too; this will be how you access the bucket. I’ll use “Demo”.

My ExaBuckets now looks like this. It has bfsdefault – which will be the service bucket you have, by default, when you get Version 6.

[Screenshot: ExaBuckets tab listing the bucket services]

Click on the ID of the service you just created. This will take you through to where we can create the actual bucket. Click on Add.

[Screenshot: bucket service with no buckets yet]

I’ve added the name “libraries”; we’ll need this later. I’ve checked Public readable, and provided a read password (‘xyz’) and a write password (‘123’) for the demo.

[Screenshot: new bucket form]

Your bucket will be displayed like this.

[Screenshot: bucket created]

Upload library to Bucket using Curl

Ok, now let’s use the bucket we created. I’m going to be using Windows, so open up a cmd window and change the directory to your Curl one from earlier. Let’s list out what we find… the libraries bucket.

c:\curl>curl YOUR_IP_HERE:2585

libraries

To put something in the bucket, first download the library you need locally, from a reputable source such as the Python Package Index (PyPI).

Now we’ll get it into our bucket. We use curl’s PUT command with the write password we set earlier, along with the bucket name we specified (libraries). Here, I’m uploading the boto3 library.

c:\curl>curl -X PUT -T C:\boto3-1.4.7.tar.gz http://w:123@YOUR_IP_HERE:2585/libraries/boto3-1.4.7.tar.gz

Check your library is there by listing out the bucket.

c:\curl>curl YOUR_IP_HERE:2585/libraries

boto3-1.4.7.tar.gz
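If you would rather script this than type curl commands, the same two calls can be sketched with Python’s standard library. Treat it as a rough sketch rather than anything Exasol ships: the function names are my own, and the host, port 2585, bucket name and write password are just the demo values used above.

```python
import base64
import urllib.request


def bucket_url(host, port, bucket, filename=""):
    """Build the URL of a bucket, or of one file inside it."""
    url = "http://{}:{}/{}".format(host, port, bucket)
    return url + "/" + filename if filename else url


def build_upload_request(host, port, bucket, filename, data, write_password):
    """Prepare the equivalent of: curl -X PUT -T <file> http://w:<pw>@host:port/bucket/file"""
    request = urllib.request.Request(
        bucket_url(host, port, bucket, filename), data=data, method="PUT"
    )
    # curl turns the w:<password> part of the URL into HTTP basic auth.
    token = base64.b64encode("w:{}".format(write_password).encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


def upload_file(local_path, host, port, bucket, filename, write_password):
    """Read a local file and PUT it into the bucket; returns the HTTP status."""
    with open(local_path, "rb") as handle:
        request = build_upload_request(
            host, port, bucket, filename, handle.read(), write_password
        )
    with urllib.request.urlopen(request) as response:
        return response.status


def list_bucket(host, port, bucket):
    """Return the file names in a public readable bucket."""
    with urllib.request.urlopen(bucket_url(host, port, bucket)) as response:
        return response.read().decode("utf-8").split()
```

With the demo values, upload_file(r"C:\boto3-1.4.7.tar.gz", "YOUR_IP_HERE", 2585, "libraries", "boto3-1.4.7.tar.gz", "123") followed by list_bucket("YOUR_IP_HERE", 2585, "libraries") would mirror the two curl commands above.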

Use library in UDF script

So, we’ve uploaded our library to the bucket, but our UDFs still don’t work! Your script needs the path to the library specifying before the import statement will work.

[Screenshot: UDF script with the library path set before the import]
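In case the screenshot doesn’t come through, here is a minimal sketch of the idea in plain Python. The helper name is mine, and the /buckets/demo/libraries path in the comment is an assumption about where the bucket appears inside the UDF; the real path depends on your bucket service and bucket names and on how the archive unpacks, so check it on your own cluster.

```python
import glob
import os
import sys


def add_bucket_libraries(bucket_path):
    """Append every entry found directly under bucket_path to sys.path, once."""
    added = []
    for entry in sorted(glob.glob(os.path.join(bucket_path, "*"))):
        if entry not in sys.path:
            sys.path.append(entry)
            added.append(entry)
    return added


# Inside the UDF body, before the import (the path is an assumption):
# add_bucket_libraries("/buckets/demo/libraries")
# import boto3
```

The only point is the ordering: the path has to be on sys.path before Python reaches the import line.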

Exasol have added a video of how to do some of this here.

Personally, I wish they’d kept the GUI in ExaOperation (or a flavor of it), as it was way quicker to use, and this is such a faff to do. It’s also one more difference between just writing a Python script with a traditional import and an Exasol-specific thing to remember.

Anyway, it’s all there for you if you read it (from me or Exasol!), and a great reminder to study the upgrade notes and think about their implications.

 

HOW TO: Restart the licence node

Welcome back for the fourth instalment of the AdminSeries. Catch up on the previous posts first: How to Shutdown the Database(s), How to Shutdown the Storage and How to Shutdown the Data Nodes.

For this post we’ll be restarting the licence node.

To do this, I recommend you follow all the steps in the above posts beforehand.

That said, head over to the Nodes tab in ExaOperation. Scroll down past the nodes (if you have a few!).

[Screenshot: License Nodes section with the Reboot button]

In the License Nodes section, click Reboot. If all goes well, this will take down the page you are viewing, indeed all of ExaOperation (as it primarily resides on the licence node). You should be left with your browser’s page not found or something similar.

Again, I hope you have your patience ready. For me, this step takes around 5 minutes, in which cold sweats ensue (WTF have I done!). But it’s ok.

Refresh the page periodically until ExaOperation comes back.
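If hammering F5 isn’t your thing, the waiting can be sketched as a small polling loop. This is just an illustration under my own assumptions: the function names are made up, and the URL you probe is whatever address you normally use for ExaOperation. Note that a self-signed HTTPS certificate would make this simple probe report the page as down, so you may need your own probe.

```python
import time
import urllib.error
import urllib.request


def page_answers(url, timeout=5):
    """Return True if a web server answers at url, False if nothing responds."""
    try:
        urllib.request.urlopen(url, timeout=timeout).close()
        return True
    except urllib.error.HTTPError:
        return True  # the server replied, even if only with an error page
    except OSError:
        return False  # refused, timed out or otherwise unreachable


def wait_until_back(probe, attempts=40, delay=15, sleep=time.sleep):
    """Call probe() until it returns True; return the attempt number, or None."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        sleep(delay)
    return None
```

Something like wait_until_back(lambda: page_answers("http://YOUR_IP_HERE")) would then check every 15 seconds for up to ten minutes, which comfortably covers the five or so minutes it takes for me.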

This concludes Restarting the licence node.

HOW TO: Shutdown the Data Nodes

Time for the third post in the AdminSeries – how to shutdown the data nodes. This post follows the previous two: How to Shutdown the Database(s) and How to Shutdown the Storage. Before you read much further, go check them out first.

With that out the way, let’s get started.

In ExaOperation, head over to the Nodes tab. It should look something like this:

[Screenshot: Nodes tab with the nodes running]

In this shot, you can see the State of the nodes is Running and they are Powered On.

Next up, select the checkboxes next to the Nodes that you want to shut down. From the Actions drop down, select Shutdown, and then click the Execute button.

[Screenshot: Shutdown selected in the Actions drop down]

It may take some time for the nodes to shut down. As with all things ExaOperation, be patient and wait for it. You can refresh the page and watch for the node state to change. Sometimes the Nodes change state one at a time, so you may be waiting for some of them.

Eventually your page should look like this. All of the Nodes will have the state of Installed, and Power Off. Your nodes are now Shutdown.

[Screenshot: Nodes tab with the nodes powered off]

That brings us to the end of How to Shutdown Data Nodes. Next up, we’ll be looking at How to Restart the licence node.

HOW TO: Shutdown the Storage

The second post in the AdminSeries is going to be about how to Shutdown the Storage. Before you read this post, or are trying to do this, go back and make sure you have Shutdown the Database(s) first.

So, assuming that you have already Shutdown the Database(s), head over to the EXAStorage tab in EXAOperation (a lot of EXA’s going on there guys!).

It should look a little something like this:

[Screenshot: EXAStorage tab with the storage running]

I’ve got two databases, with three nodes each and a licence node. This means I have 6 volumes listed up top, and 7 disks from my nodes at the bottom. You may have more or less depending on your setup.

Click Shutdown Storage Service.

The page should then look like this:

[Screenshot: EXAStorage tab with the storage stopped]

Here you can see the 7 nodes – 6 data nodes and 1 licence server.

Once the EXAStorage page looks like this, the Storage is stopped.

Having Shutdown the Database(s) and the Storage, we’ll move onto how to Shutdown the Nodes.

HOW TO: Shutdown the database(s)

This post is the first in a series of posts about how to administer Exasol. We’re going to be covering how to Shutdown databases, Shutdown storage, shutdown the nodes, restart the licence server, install CentOS patches and restart the whole lot. For all of the HOW TO: posts regarding administration, you will need access to ExaOperation.

Firstly, a quick disclaimer: This set of posts is based around 5.0.17, and so should be used in that context. If you are unsure about what you are doing, seek professional help. Do some of these things wrong, and you could jeopardise your data.

Ok, so without further ado…

In general, you should always shutdown the database first, before other shutdowns such as the storage etc. Obviously in the case of an unplanned system shutdown, you don’t have much choice.

So first log into ExaOperation, then head over to the EXASolution tab.

In my case, I have two databases running. I can tell that they are running because of the Status, and also that they are Online and behaving well, with the green light. Databases can be Running and Stopped separately. However, if you intend to perform further maintenance, like Shutting down all nodes or Stopping the Storage, you should stop them all.

[Screenshot: databases running]

To Shutdown the databases, select the tick box next to the databases and then select Shutdown.

You can then manually refresh this page. It may take up to a minute to stop the databases.

When they are stopped the page looks like this:

[Screenshot: databases stopped]

We can tell that the databases are stopped, because they have a Status of “Created”, and the light indicating that the database is Online has gone out/grey.

So that’s it, your database has stopped.

The next post is then How to Shutdown the Storage.

HOW TO: Execute multiple commands from a file in EXASOL

In Exaplus we can execute multiple commands by hitting the Execute All button, or by pressing CTRL + SHIFT + ENTER.

But what if we have a file of multiple commands? Should we copy the contents out and then execute them? Seems a bit of a faff.

Enter the @ command. Or also the START command, but I prefer @. A bit more catchy!

To run commands from a file in Exaplus, run the following:

@C:\folder\myfile.sql;

or if your file resides on FTP:

@ftp://my:ftp@ftp.scripts/myfile.sql;
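If you have a whole folder of scripts to run, you could generate those @ lines rather than typing them. Here is a tiny Python sketch of that idea; the function name and file names are made up for illustration, and it only builds the text, so you would still paste or run the result in Exaplus yourself.

```python
def at_commands(script_paths):
    """Return one Exaplus '@<path>;' line per script, in the order given."""
    return ["@{};".format(path) for path in script_paths]


# Hypothetical file names, purely for illustration.
for line in at_commands([r"C:\folder\create_tables.sql", r"C:\folder\load_data.sql"]):
    print(line)
```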

I’ve found this command to be really handy, particularly in developing my own ‘automated deployment and comparison tool’ for Exasol. But that’s for another day!


Another GUI, another blog


First off, Happy New Year from me in France! I’m out in Les Arcs, skiing for the week (best way to start the year back at work if you ask me!!).

So, in my round up of 2016, I promised you a look at a blog written by one of the guys on my team at Atheon. Vinit has written in his blog about another GUI which can interface with Exasol. An alternative to Exaplus, if you will. And, best of all, it comes with IntelliSense! I’ll let Vinit tell you the rest over at his blog optimumretrieval.