This FAQ covers general questions and problems that users can encounter in any dialect. For dialect-specific FAQs, please check the dialect pages.
Answer: You can create a User Defined Function and query your bucket to check whether the files are in the right place. Please check the exa-toolbox repository to find out how to query a bucket.
There are also other ways to access the bucket: the curl command or BucketFS Explorer.
You can read more about them in the official documentation.
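For example, BucketFS also answers plain HTTP requests, so you can list a bucket's contents with curl. This is only a sketch, assuming the default BucketFS port 2580 and a bucket called default; the host and read password are placeholders:

```
curl http://r:<read_password>@<exasol_host>:2580/default
```

A successful response lists the files stored in the bucket, which you can compare against the paths you reference in your scripts.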
Answer: We recommend checking the connection without Virtual Schemas first and creating a Virtual Schema only after you are sure that the connection is established.
To check the connection, you need to create a connection object. You will also need this connection object to create a Virtual Schema.
Create a connection according to the dialect's user guide. A quick way to test the connection is to run an IMPORT FROM JDBC query with it.
For example:
IMPORT FROM JDBC AT <your_connection_name>
STATEMENT 'SELECT 1 FROM <source schema>.<source table>';
If the statement was executed successfully, the connection is established correctly, and you can create a Virtual Schema using this connection.
What happens if I upload two different JDBC drivers with the same name and the same prefix, but the driver's versions are different?
Answer: The database goes through the list of JDBC drivers top-down and uses the first suitable driver it detects. In this situation, the driver that is higher in the list of JDBC drivers will be used.
This chapter describes problems that can occur while creating a Virtual Schema.
A virtual schema does not recognise a property and throws a dialect-does-not-support-property exception.
VM error:
com.exasol.adapter.dialects.PropertyValidationException: The dialect <DIALECT_NAME> does not support <PROPERTY_NAME> property. Please, do not set the BIGQUERY_ENABLE_IMPORT property.
Solutions:
- Check that you don't have typos in the specified property. Properties are case-sensitive.
- Check for spaces and other hidden characters.
- Check if the dialect supports the property you are trying to use. Each dialect has its own set of supported properties. Check the dialect documentation.
- Check that you use a Virtual Schema version that supports the specified property. Check the release logs on GitHub to find out in which version the property you need was added.
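For reference, properties are set in the WITH clause of the CREATE VIRTUAL SCHEMA statement. A minimal sketch — the schema, adapter script, connection, and source schema names below are placeholders, and the supported property set depends on the dialect:

```sql
CREATE VIRTUAL SCHEMA MY_VIRTUAL_SCHEMA
USING ADAPTER_SCHEMA.MY_ADAPTER_SCRIPT
WITH
  CONNECTION_NAME = 'MY_CONNECTION'
  SCHEMA_NAME     = 'MY_SOURCE_SCHEMA';
```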
Solutions:
- Check the JDBC driver you registered in EXAoperation and make sure you have registered the driver correctly.
- Check if you have typos in the main class definition or prefix.
- Check for spaces and other hidden characters.
- Now check the driver itself: open the JAR archive and make sure the file META-INF/services/java.sql.Driver exists.
- If the file exists, open it and make sure it contains the driver's main class reference you specified in EXAoperation.
- If the file does not exist or does not contain the correct main class reference, you can add it and re-upload the fixed JAR archive. You should also report the problem to the developers of the driver.
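The check above can also be scripted. The sketch below builds a tiny stand-in JAR in memory purely for demonstration — in practice you would open your real driver JAR file, and the class name com.mysql.jdbc.Driver is only an example:

```python
import io
import zipfile

# Build a minimal stand-in "driver JAR" in memory for demonstration.
# In practice, open the real file instead: zipfile.ZipFile("my-driver.jar")
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("META-INF/services/java.sql.Driver", "com.mysql.jdbc.Driver\n")

# Verify the service registration file exists and read the main class name.
with zipfile.ZipFile(buf) as jar:
    assert "META-INF/services/java.sql.Driver" in jar.namelist()
    main_class = jar.read("META-INF/services/java.sql.Driver").decode().strip()

# This value must match the main class you registered in EXAoperation.
print(main_class)
```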
Solutions:
- If you are using DbVisualizer, add an empty comment --/ before the create adapter script statement:
--/
CREATE OR REPLACE JAVA ADAPTER SCRIPT ...
- If you are using DBeaver, mark the whole statement (except the trailing ;) and execute that statement in isolation (CTRL + RETURN).
I have started a CREATE VIRTUAL SCHEMA statement, but it is running endlessly without giving any output or an error.
Solutions:
- Check if you specified the SCHEMA_NAME property. If you do not add this property, the virtual schema tries to read the metadata of all tables existing in the source database, which can take a very long time.
- Check how many tables exist in the schema you specified. If there are more than a few hundred tables, creating a virtual schema can also take time.
- Check the remote log to see whether the VS is stuck trying to establish a connection to the remote source. If so, try to reach that source directly from the same network to rule out network setup issues.
Answer: Yes, you can use TABLE_FILTER = 'TABLE1','TABLE2',... . Only these tables will be available as virtual tables; all other tables are excluded.
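A sketch of such a statement — the schema, adapter script, connection, and table names are placeholders:

```sql
CREATE VIRTUAL SCHEMA MY_VIRTUAL_SCHEMA
USING ADAPTER_SCHEMA.MY_ADAPTER_SCRIPT
WITH
  CONNECTION_NAME = 'MY_CONNECTION'
  SCHEMA_NAME     = 'MY_SOURCE_SCHEMA'
  TABLE_FILTER    = 'TABLE1','TABLE2';
```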
I created an Exasol-Exasol Virtual Schema on a view of an Exasol database, but the view does not exist in the Virtual Schema.
In that case the view was probably outdated when you created the Virtual Schema.
This can happen if you created the view using CREATE FORCE VIEW, or if you updated a table that is part of the view after creating it but did not refresh the view (Exasol refreshes views, for example, when they are queried).
Solution:
- Refresh the view on the source (for example using DESCRIBE MY_VIEW).
- Refresh the Virtual Schema (using ALTER VIRTUAL SCHEMA MY_SCHEMA REFRESH).
The virtual schema was created successfully, but when you try to run a SELECT query, you get an access denied error with some permission name.
For example:
JDBC-Client-Error: Failed loading driver 'com.mysql.jdbc.Driver': null, access denied ("java.lang.RuntimePermission" "setContextClassLoader")
Solution:
- This happens because the JDBC driver requires more permissions than we grant by default. You can disable the security manager of the corresponding driver in EXAoperation to solve this problem.
When using a Kerberos connection with the Hive or Impala dialects, you may get the following exception:
com.exasol.adapter.jdbc.RemoteMetadataReaderException: Unable to create Impala remote metadata reader. Caused by: [Cloudera]ImpalaJDBCDriver Error creating login context using ticket cache: Unable to obtain Principal Name for authentication.
com.exasol.adapter.dialects.impala.ImpalaSqlDialect.createRemoteMetadataReader(ImpalaSqlDialect.java:127)
com.exasol.adapter.dialects.AbstractSqlDialect.readSchemaMetadata(AbstractSqlDialect.java:138)
com.exasol.adapter.jdbc.JdbcAdapter.readMetadata(JdbcAdapter.java:56)
Solution:
One possible reason for this issue is the Kerberos service principal (hive or impala) domain name resolution. That is, even though the Kerberos realm is reachable from the Exasol node, the service domains may not be reachable due to the internal DNS settings.
You can confirm this by using the ping or dig commands from one of the Exasol nodes.
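For example, from one of the Exasol nodes (the host name is a placeholder for your Hive or Impala service domain):

```
ping -c 1 hive-server.zone1.example.net
dig +short hive-server.zone1.example.net
```

If the name does not resolve from the node, the internal DNS settings are the likely cause.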
To update the DNS settings for the Exasol cluster, go to ExaSolution → Configuration → Network and update the internal DNS server addresses and search domain entries.
Ensure that the Kerberos configuration file, krb5.conf, is correct.
Check that the domain realms after the assignment sign (=) are correct. They should be specified in all capital letters and should not include a leading dot (.) character after the assignment.
Wrong configuration file example:
...
[domain_realm]
.zone1.example.net = zone1.example.net
zone1.example.net = zone1.example.net
.example.net = .EXAMPLE.net
example.net = EXAMPLE.net
.example-dev.net = .EXAMPLE-DEV.NET
example-dev.net = EXAMPLE-DEV.NET
Correct configuration file example:
...
[domain_realm]
.zone1.example.net = ZONE1.EXAMPLE.NET
zone1.example.net = ZONE1.EXAMPLE.NET
.example.net = EXAMPLE.NET
example.net = EXAMPLE.NET
.example-dev.net = EXAMPLE-DEV.NET
example-dev.net = EXAMPLE-DEV.NET
Ensure that the Kerberos configuration file, krb5.conf, does not contain any includedir directives pulling in additional settings. All Kerberos settings should be available in the krb5.conf configuration file itself.
This can cause problems when using Virtual Schemas together with a Kerberos connection, because the included directories do not exist in the UDF container.
Wrong configuration file example:
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = ZONE1.EXAMPLE.NET
dns_lookup_kdc = false
dns_lookup_realm = false
...
The includedir folder contains a file with a setting, udp_preference_limit = 1. Add such settings to the [libdefaults] section of the krb5.conf file.
Correct configuration file example:
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = ZONE1.EXAMPLE.NET
dns_lookup_kdc = false
dns_lookup_realm = false
udp_preference_limit = 1
...
And then create a connection object using the modified Kerberos configuration file.
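As a sketch of what such a connection object can look like — the JDBC URL, host, principal, and the two base64 placeholders (the base64-encoded krb5.conf and keytab) are placeholders, and the exact URL parameters depend on your driver:

```sql
CREATE OR REPLACE CONNECTION KRB_HIVE_CONN
  TO 'jdbc:hive2://hive-server.example.com:10000;AuthMech=1;KrbRealm=ZONE1.EXAMPLE.NET;KrbHostFQDN=hive-server.example.com;KrbServiceName=hive'
  USER 'user@ZONE1.EXAMPLE.NET'
  IDENTIFIED BY 'ExaAuthType=Kerberos;<base64 krb5.conf>;<base64 keytab>';
```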
In Virtual Schema Kerberos connections, users can also use Zookeeper for service discovery of Hive or Impala servers. Zookeeper balances the connections and avoids a single point of failure for the Hive or Impala servers.
In this case, set the KrbHostFQDN property to the _HOST value. This removes hardcoded server addresses and uses the connection addresses provided by Zookeeper.
Example:
jdbc:hive2://zk=zookeeper001.dev.example.com:2181/hiveserver2,zk=zookeeper002.dev.example.com:2181/hiveserver2,zk=zookeeper003.dev.example.com:2181/hiveserver2;AuthMech=1;KrbHostFQDN=_HOST;KrbRealm=ZONE1.EXAMPLE.NET;KrbAuthType=1;KrbServiceName=hive;transportMode=http;httpPath=cliservice
Answer:
- Start remote logging to see the logs from Virtual Schemas.
- Try the EXPLAIN VIRTUAL command to see the query that the Virtual Schema generates. For example:
EXPLAIN VIRTUAL SELECT * FROM MY_VIRTUAL_SCHEMA.MY_TABLE;
Answer: We have restricted the number of mapped tables to 1000. Please use the TABLE_FILTER property to specify the tables you need if your schema contains more than 1000 tables.
Answer: Views are treated the same as tables by Virtual Schemas.
Solution: If you use remote logging, a number of factors can slow down the execution of a Virtual Schema.
Those are the things you can do to improve performance:
- Make sure there is a fast network connection between the cluster nodes running the Virtual Schema and the machine receiving the log;
- Lower the DEBUG_LEVEL to INFO or WARNING;
- Disable remote logging.
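Assuming your dialect allows changing the property on an existing Virtual Schema, lowering the log level could look like this (the schema name is a placeholder):

```sql
ALTER VIRTUAL SCHEMA MY_VIRTUAL_SCHEMA SET DEBUG_LEVEL = 'WARNING';
```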
Solution: Depending on which JDK version Exasol uses to execute Java user-defined functions, a blocking random-number source may be used by default.
Especially cryptographic operations do not complete until the operating system has collected a sufficient amount of entropy (read "real random values").
This problem mostly occurs when Exasol is run in an isolated environment, typically a virtual machine or a container.
Operating systems use various sources of random data input, like keystroke timing, disk seeks and network timing. You can increase entropy by running processes in parallel that feed the entropy collection. Which ones those are depends on the OS.
Especially server machines often have dedicated hardware entropy sources. Still, commodity hardware parts like sound adapters can be repurposed to create randomness, e.g. from the random noise of an analog input. In order to utilize those in virtual machines you usually need drivers and/or guest extensions that allow reading random data from the host.
Since randomness is usually used for security measures like cryptography, using pseudo-random data is dangerous! Pseudo-random is another word for "guessable" and that is not what you want for cryptography.
If you intend to use this option, then do it only for integration tests with non-confidential data:
- Log in to EXAoperation and shut down the database.
- Append -etlJdbcJavaEnv -Djava.security.egd=/dev/urandom to the "Extra Database Parameters" input field and power the database on again.