Apache Sentry is the new way to provide security (e.g. privileges on SQL statements such as SELECT, CREATE…) when querying data in Hadoop. Impala offers fast SQL for Apache Hadoop and can leverage Sentry. Here is how to configure Hue and Impala to use it:
First, enable impersonation in hue.ini. That way, permissions are checked against the current user and not ‘hue’, which acts as a proxy:
[impala]
impersonation_enabled=True
Then you might hit this error:
User 'hue' is not authorized to impersonate 'romain'. User impersonation is disabled.
This is because Hue is not authorized to act as a proxy. To fix it, start Impala with this flag:
-authorized_proxy_user_config=hue=*
Note: if you use Cloudera Manager, add the flag to the ‘Impalad Command Line Argument Safety Valve’.
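To make the meaning of the flag value concrete, here is a minimal Python sketch of how such a proxy rule can be read. It is not Impala's implementation. The source only shows `hue=*`; the wider syntax used here (comma-separated user names, semicolon-separated proxy entries) is an assumption based on the Impala documentation and should be checked against your Impala version. The function names `parse_proxy_config` and `may_impersonate` are hypothetical.

```python
def parse_proxy_config(value, user_delimiter=","):
    """Parse a value such as 'hue=romain,alice;admin=*' into {proxy: {users}}.

    Assumption: entries are separated by ';' and users by ','.
    """
    rules = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        proxy, _, users = entry.partition("=")
        rules[proxy.strip()] = {
            user.strip() for user in users.split(user_delimiter) if user.strip()
        }
    return rules


def may_impersonate(rules, proxy_user, target_user):
    """Return True if proxy_user is allowed to act on behalf of target_user."""
    allowed = rules.get(proxy_user, set())
    # '*' means the proxy user may impersonate anyone.
    return "*" in allowed or target_user in allowed


# The value from the flag above: 'hue' may impersonate any user.
open_rules = parse_proxy_config("hue=*")
print(may_impersonate(open_rules, "hue", "romain"))

# A stricter, illustrative value: 'hue' may only impersonate 'romain'.
strict_rules = parse_proxy_config("hue=romain")
print(may_impersonate(strict_rules, "hue", "bob"))
```

With `hue=*`, Hue can submit queries on behalf of any logged-in user, which is what the error message above was complaining about. Listing explicit user names instead of `*` narrows that delegation.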
And that’s it! You can now benefit from real security similar to Hive! As usual, feel free to comment on the hue-user list or @gethue!
Note: if you are on CDH4/Hue 2.x, make sure that Hue is configured to talk to Impala with the HiveServer2 API:
[impala]
# Host of the Impala Server (one of the Impalad)
server_host=nightly-1.ent.cloudera.com
# The backend to contact for queries/metadata requests.
# Choices are 'beeswax' or 'hiveserver2' (default).
# 'hiveserver2' supports log, progress information, query cancellation
# 'beeswax' requires Beeswax to run for proxying the metadata requests
server_interface=hiveserver2
# Port of the Impala Server
# Default is 21050 as HiveServer2 Thrift interface is the default.
# Use 21000 when using Beeswax Thrift interface.
server_port=21050
# Kerberos principal
## impala_principal=impala/hostname.foo.com
impersonation_enabled=True
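As a quick sanity check of the settings above, the following Python sketch reads an [impala] fragment and reports anything that disagrees with the values described in the comments (21050 for hiveserver2, 21000 for beeswax, impersonation enabled). It is only an illustration: the function name `check_impala_section` is hypothetical, and it assumes a fragment containing just the flat [impala] section, since a full hue.ini has nested sections that Python's `configparser` is not designed for.

```python
import configparser

# Default ports named in the configuration comments above.
EXPECTED_PORTS = {"hiveserver2": 21050, "beeswax": 21000}

SAMPLE = """
[impala]
# Host of the Impala Server (one of the Impalad)
server_host=nightly-1.ent.cloudera.com
server_interface=hiveserver2
server_port=21050
impersonation_enabled=True
"""


def check_impala_section(ini_text):
    """Return a list of warnings for the [impala] section of ini_text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section("impala"):
        return ["no [impala] section found"]

    section = parser["impala"]
    warnings = []

    # 'hiveserver2' is the default interface according to the comments above.
    interface = section.get("server_interface", fallback="hiveserver2")
    if interface not in EXPECTED_PORTS:
        warnings.append(f"unknown server_interface: {interface}")
    else:
        expected = EXPECTED_PORTS[interface]
        port = section.getint("server_port", fallback=expected)
        if port != expected:
            warnings.append(
                f"server_port {port} differs from the default {expected} for {interface}"
            )

    if not section.getboolean("impersonation_enabled", fallback=False):
        warnings.append("impersonation_enabled is not set to True")

    return warnings


print(check_impala_section(SAMPLE))
```

A non-default port is not necessarily wrong, so the function returns warnings rather than raising an error; an empty list means the fragment matches the values shown in this post.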
Note: to give a concrete idea, here is a video demo that shows the end-user interaction in the UI (it uses the Hive app, but you will get the exact same result with the Impala app).