What happens to the Identity value when you Truncate a table in Exasol?

Today was challenging at work for many reasons, but it was a good day for learning new things!

I thought I’d share with you a nugget that I learnt today.

When I truncate a table, I expect the all of the data to be permanently deleted, essentially like it was never there. For all indexes to be emptied, and any auto-incrementing constraints to be reset.

Let’s give it a go then… *simples*…no?

CREATE TABLE ID_TESTING(
ID DECIMAL(18,0) IDENTITY NOT NULL ENABLE,
A DECIMAL(18,0),
B DECIMAL(18,0)
);

I then insert 3 rows:

insert into ID_TESTING (A,B)
SELECT 1,2;
insert into ID_TESTING (A,B)
SELECT 2,2;
insert into ID_TESTING (A,B)
SELECT 3,2;

This gives me 3 rows, with an identity value of 3, which we can check by looking in one of the sys tables.

SELECT column_identity
FROM SYS.EXA_ALL_COLUMNS
WHERE column_schema = 'DUMMY'
AND column_table = 'ID_TESTING'
AND column_name = 'ID';

Finally, I truncate the table;

truncate table ID_TESTING;

So if we check the identity value this should be zero right? Using the same query as above, we can see it remains at 3.

This is obviously not ideal, if you’re loading large amounts of data, or have not set your identity column to be DECIMAL(18,0), or you’re just plain expecting a counter to start at 1. Having got in touch with Exasol about this, as it just doesn’t make sense to me, it seems it had been raised as an issue previously, logged as IDEA 177, but has been discounted. As such, it makes this behaviour, accepted.

If, like me, this doesn’t really work for you; we need a workaround…

…I’ll cover it in my next blog post!

 

HOW TO: Find out which features are covered by your Exasol licence?

Following on from the last post about how to build an Exasol equivalent of a Recursive CTE using Graph Search, you probably need to know if you have that feature available to you.

To do this, you’ll need access to ExaOperation.

In ExaOperation, head over to Configuration>Software>Licence.

licence

Here I’ve commented out the General licence info, for obvious reasons. Down the bottom, under Extra Database Features, you can see the Graph search option. Here it says ‘Unlimited’, but also valid is a date. ‘not available’ means that you do not have that functionality as part of your licence. To get features added to your licence, contact your Account Manager.

HOW TO: Recursive Queries Using Common Table Expressions (CTE)

CTEs are pretty standard. Recursive CTEs are pretty mainstream too. They’re great for calculations amongst a hierarchy, think manger-employee type relationship; both are employees and as such sit in the same table. To analyse this hierarchy, you would need a recurring self join. This same technique can be used to analyse non-hierarchical problems too. Microsoft have got a pretty comprehensive guide to Recursive CTEs here, if you would like to understand that first.

So, the problem: Charity shops have tins, for donations on the counters. Every day that donations are made, the total donation is emptied from the tin, and the total donation recorded against a store and a day. Looking back, we now want to know how many days it is until the next donation is made, and from then, the next and so on in each store. Then we could do some trend analysis on that data about how often stores are receiving donations.

Let’s have a look at the sort of thing we might have wanted to do before in SQL Server, and then we’ll get into how we’ll go about doing it in Exasol.

This is the setup code for SQL:

CREATE TABLE TBL_STOREDONATIONS (
    STOREID       DECIMAL(18,0),
    DONATIONDATE  DATE,
    DONATIONVALUE DECIMAL(18,2)
);

INSERT INTO CTE_EXAMPLE.DBO.TBL_STOREDONATIONS
SELECT 1, '2016-01-12', '3.00'
UNION ALL SELECT 1, '2016-01-20', '5.00'
UNION ALL SELECT 1, '2016-01-20', '4.20'
UNION ALL SELECT 2, '2016-01-14', '1.00'
UNION ALL SELECT 2, '2016-01-15', '3.40'
UNION ALL SELECT 1, '2016-01-29', '5.50'
UNION ALL SELECT 1, '2016-01-30', '2.10'
UNION ALL SELECT 3, '2016-01-17', '7.80'
UNION ALL SELECT 3, '2016-01-27', '1.00'
UNION ALL SELECT 3, '2016-01-28', '1.20'
UNION ALL SELECT 4, '2016-01-30', '5.44'
UNION ALL SELECT 1, '2016-02-03', '1.23';

Here’s the code for the Recursive CTE:

WITH Donations (StoreID, DonationDate, DonationOrder, DaysToNext)
AS
(
-- Anchor member definition
    SELECT  e.StoreID
		, e.DonationDate 
		, rnk
		, null
    FROM tbl_storedonations AS e
	INNER JOIN(
		SELECT Storeid
		, donationdate
		, DENSE_RANK() OVER (PARTITION BY storeid ORDER BY donationdate DESC) rnk
		FROM TBL_STOREDONATIONS
		)rnkd
	ON e.storeid = rnkd.storeid
	AND e.donationdate = rnkd.donationdate
	WHERE rnk =1 --get last donation    
    UNION ALL
-- Recursive member definition
    SELECT e.StoreID
		, e.DonationDate
		,rnk
		,DATEDIFF(dd, e.donationdate,d.DonationDate) 
    FROM dbo.TBL_STOREDONATIONS AS e
	INNER JOIN(
		SELECT Storeid
		, donationdate
		, DENSE_RANK() OVER (PARTITION BY storeid ORDER BY donationdate DESC) rnk
		FROM TBL_STOREDONATIONS
		)rnkd
	ON e.storeid = rnkd.storeid
	AND e.donationdate = rnkd.donationdate
    INNER JOIN Donations AS d
    ON e.StoreId = d.StoreID
	AND rnk = donationorder + 1
)
-- Statement that executes the CTE
SELECT storeid
, donationdate
, daystonext
FROM Donations
ORDER BY storeid ASC, donationdate DESC
GO

Here’s the output, exactly what we wanted.

SQL_CTE_OUTPUT

Unfortunately, this is not natively supported in Exasol, using recursive CTEs.

There is a light at the end of the tunnel though. We can build something we the same output, by using the Exasol feature, Graph search.

Here’s the Exasol setup code to this demo:

CREATE SCHEMA CTE_EXAMPLE;

CREATE TABLE TBL_STOREDONATIONS (
    STOREID       DECIMAL(18,0),
    DONATIONDATE  DATE,
    DONATIONVALUE DECIMAL(18,2)
);

INSERT INTO CTE_EXAMPLE.TBL_STOREDONATIONS
SELECT 1, '2016-01-12', '3.00'
UNION ALL SELECT 1, '2016-01-20', '5.00'
UNION ALL SELECT 1, '2016-01-20', '4.20'
UNION ALL SELECT 2, '2016-01-14', '1.00'
UNION ALL SELECT 2, '2016-01-15', '3.40'
UNION ALL SELECT 1, '2016-01-29', '5.50'
UNION ALL SELECT 1, '2016-01-30', '2.10'
UNION ALL SELECT 3, '2016-01-17', '7.80'
UNION ALL SELECT 3, '2016-01-27', '1.00'
UNION ALL SELECT 3, '2016-01-28', '1.20'
UNION ALL SELECT 4, '2016-01-30', '5.44'
UNION ALL SELECT 1, '2016-02-03', '1.23';

Here, we’re going to use the Graph Search keyword of CONNECT BY. We’re telling Exasol to join the single result set, back on itself, on the StoreId attribute and the ranking r_instore, by looking for the next instance of a donation in store. r_instore is also used to start the calculation at the last instance of a donation in a store.
This code is undoubtedly tidier and much more readable than the SQL Server equivalent.

SELECT * FROM(
SELECT 
d.StoreId
, d.DonationDate
, d.r_instore
, DAYS_BETWEEN(PRIOR donationdate, donationdate) AS DaysToNext
FROM (
SELECT distinct
StoreId
, DonationDate
, DENSE_RANK() OVER (PARTITION BY StoreId ORDER BY DonationDate desc) r_instore
FROM tbl_StoreDonations
) d

CONNECT BY
StoreId = PRIOR StoreId and
r_instore = PRIOR r_instore + 1

START WITH
r_instore = 1)DTN
ORDER BY STOREID ASC
, DONATIONDATE ASC;

Here’s the output from Exasol, again just what we needed.EXA_CTE_OUTPUT

It is important to note, that to use this functionality, you will need Graph Search as part of your Exasol licence.

Error “Can not find the Java Runtime Environment needed by EXAplus GUI. Trying to use system wide virtual machine (from registry) instead. No 64 bit java virtual machine found in registry.”

Don’t you just love that new machine setup process…said nobody ever. Today I got hold of my new machine at work (which is great, thanks guys!). First job is to get all of the programs on that I need.

After the essentials, I got round to installing Exaplus. You can get it from here

I ran the install and got this error on completion:

Can not find the Java Runtime Environment needed by EXAplus GUI. Trying to use system wide virtual machine (from registry) instead. No 64 bit java virtual machine found in registry.

This rang a bell, and was quite similar to this error I’ve had before.

This time round, it was because I didn’t have the 64 bit Java. To fix it, I got the 64 bit download for Java from here.

The install did not fail, it was just the opening of Exaplus that was causing the error. With 64 bit Java installed, Exaplus loaded fine.,

 

 

SQL Relay 2016

SQLRelay

So if you’re not attending the Exasol breakfast on Friday, and data is your thing, how about SQL Relay?

Admittedly I’m a bit late on the bandwagon here. This years’ SQL Relay is already halfway through, with the Birmingham, Cardiff and Reading events already done.

Tomorrow, 5th October, is Nottingham, with the Relay wrapping up in Leeds on Friday 6th October.

I’m heading to Relay in Leeds. There’s 3 main tracks of talks, with the added bonus of a workshop track. Best of all it’s free!

My itinerary is as follows:

9.30-12.30 – Workshop on Azure Machine Learning with Amy Nicholson and Andrew Fryer from Microsoft

12.30-2.30 – Lunch / Networking (come find me!)

2.30-3.30 – DatOps Journey at Sky Betting with Andy Burgin

3.45-4.40 – Simon Whiteley discusses Diving into Data Lakes

If you know of anymore conferences free or otherwise that you think could be interesting,  get in touch and let me know.

Error: [27002] constraint violation – primary key (PK_ on table TABLE) (Session: )

You might get this error if you enter in a duplicate value in your primary key column. Seems pretty standard and fair enough. But what about when you’re not?

This error can occur when you have an IDENTITY primary key column, that you have entered values into.

For example my table looked something like this:

CREATE TABLE DIM_TBL_STORE (
STOREID DECIMAL(18,0) IDENTITY NOT NULL ENABLE,
STORENAME VARCHAR(250) UTF8);
ALTER TABLE DIM_TBL_STORE ADD CONSTRAINT PK_DIM_TBL_STORE PRIMARY KEY(STOREID) ENABLE;

We can then check the value of the identity column; COLUMN_IDENTITY returns 0 as expected. All good.

select * from SYS.EXA_ALL_COLUMNS
where Column_schema = 'ROSE3'
AND COLUMN_TABLE = 'DIM_TBL_STORE'
AND COLUMN_NAME = 'STOREID';

Let’s then add some core stores with set IDs :

INSERT INTO DIM_TBL_STORE
SELECT 1, "StoreA"
UNION SELECT 2, "StoreB"
UNION SELECT 3, "StoreC"

SELECT * FROM DIM_TBL_STORE

This shows the 3 rows as expected.

What is unexpected is that the identity value is still zero. You can check this by running the exa_all_columns query, or looking at the info for the table in exaplus.

IdentityColumnWithZero

So next, let’s add a new store, this time not specifying a value for storeid.

INSERT INTO DIM_TBL_STORE (STORENAME)
SELECT 'STORED';

From this we get the error

Error: [27002] constraint violation - primary key (PK_ on table TABLE) (Session: )

Oops. Now we can’t enter any data, because the identity value being offered up by the table is 1. But 1 is taken! So, in fact, are 2 and 3. We need value 4.

To do this, the Identity value needs to be reset to a value after the max identity of the table.

ALTER TABLE DIM_TBL_STORE
ALTER COLUMN storeid SET IDENTITY 4;

Checking the exa_all_columns table, you can now see the value being 4.

The insert will now succeed.

In SQL server, we would have got around this by toggling IDENTITY_INSERT on or off, and SQL Server would have dealt with securing us the next unique value. With Exasol we’ve got to manage it a bit more.

5.0.16 and another teaser for Version 6

Last week Exasol sent out a newsletter teasing Version 6 for release in October. Exciting times!

As a precursor, there’s also been a release as version 5.0.16.

Some key notes were that the JDBC driver now requires at least Java 1.5, with Java 1.4 no longer compatible.

The release of 5.0.16 covers quite a few bugs, but also the addition of functionality for Import and Export to Amazon S3 buckets, which is a good addition.

Bug Wrong results in some combinations of equi and cross join
Bug Avoid unnecessary overhead in commit when a table is created and dropped within the same transaction
Bug UNION ALL queries may produce wrong results in some special cases
Bug Possible IMPORT failure when using ERRORS INTO clause
Bug SqlProcess terminates with SIGSEGV in cases of many expressions in select list
Bug AWS node names during boot (Stage 2-3) are wrong (n267,n268,n269)
Bug AWS cannot reboot or shut down nodes through EXAoperation
Bug EXAoperation possibly shows wrong database connection strings in Amazon AWS™ clusters
Bug Undetected node malfunction in case of only partially responsive nodes when using remote syslog servers
Bug Unusually high process-load
Bug Unexpected data exception by using GROUP_CONCAT in the HAVING clause
Bug Network reconfiguration via maintenance user may lead to misleading error message
Bug Reduced usability and transaction throughput in the case when client aborts occur during big commit
Bug ODBC: values may be truncated, when retrieving DECIMAL values as SQL_C_WCHAR
Bug Running query profiling causes reduced resource values in last day profiling views
New Feature Import/Export from/to Amazon S3™ buckets
Improvement EXAoperation: Replacing HTTPS certificates for clusters installed before version 5.0.7
Improvement EXADataProcessingExtension for VS 2013
Bug An unlikely race condition can deadlock EXASolution and EXAoperation
Bug Doubled audit entry through reconnects caused by networking issues
Bug Unexpected NULL values in RESOURCES column of auditing and SQL statistics
Improvement Improved errors messages for failed parallel connections

For more information on 5.0.16, or to download the update, take a look here on Exasol’s site.

Link

Breakfast with Exasol

coffee-1569682_1280

Exasol are putting on an event in central London on the 7th October 2016 for you to get to know more about them.

Starting with breakfast at the Eight Members Club in Moorgate, they’ll be hosting great speakers from King (Andy Done) and Adidas, discussing how they make the most out of their data with Exasol. After the talks there is a networking session, for you to network with existing and prospective Exasol users.

Find out more about the event and register to attend for free here.

The talks were great at exaxp16, and I’d recommend giving it a go if you’re interested in Exasol.

Link

Exasol in my backpack

Exasol turned out to be quite interested in the mini cluster that I developed and talked about here.

So much so, that my cluster has now been christened as “Exasol in my backpack”, and featured:

It’s great to know that people are interested in the stuff I’m building and sharing. As always, if you’d like to know more, please get in touch.

 

Error “insufficient privileges for using connection”

First of all, if you’d like to know how to setup a connection from scratch, please check out this post I wrote on setting up an SFTP connection in Exasol.

However, if you’ve just got the error “insufficient privileges for using connection”, chances are you forgot to grant permissions to some other user, who has tried to use this connection.

GRANT CONNECTION MY_SFTP_CONNECTION to ANOTHER_EXASOL_USER;

This will allow the user to use the connection in conjunction with any other database object they have permissions to use.