This blog post details how to use Kerberos and Sentry in the Hue Search Application. If you only want to use Kerberos, just skip the paragraphs about Sentry.
Kerberos enables you to authenticate users in your Hadoop cluster. For example, it guarantees that it is really the user ‘bob’ and not ‘joe’ that is submitting a job, listing files or doing a search. The next step is configuring what the user can access; this is called authorization. Sentry is the secure way to define who can see, query, and add data in the Solr collections/indexes. This is only possible because Kerberos guarantees the usernames performing the actions.
Hue comes with a set of collections and examples ready to be installed. However, with Kerberos, this requires a bit more than just one click.
First, make sure that you have a kerberized cluster (and in particular Solr Search for Hue) with Sentry configured.
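Since every command below runs as a Kerberos-authenticated user, it can help to check for a valid ticket first. The following is a minimal sketch, assuming the MIT Kerberos client tools are installed; `require_ticket` is a hypothetical helper name, not part of Hue or Solr.

```shell
# Assumes MIT Kerberos client tools; `klist -s` prints nothing and
# only sets its exit status (0 = a valid, unexpired ticket exists).
require_ticket() {
  if command -v klist >/dev/null 2>&1 && klist -s; then
    echo "Kerberos ticket found"
  else
    echo "No valid Kerberos ticket; run kinit first" >&2
    return 1
  fi
}

# Usage (hypothetical): require_ticket && solrctl instancedir --list
```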
Make sure you use the secure version of solrconfig.xml:
solrctl instancedir --generate foosecure
cp foosecure/conf/solrconfig.xml.secure solr_configs_twitter_demo/conf/solrconfig.xml
solrctl instancedir --update twitter_demo solr_configs_twitter_demo
solrctl collection --reload twitter_demo
Then, create the collection. The command should work as-is if you have the proper Solr environment variables.
cd $HUE_HOME/apps/search/examples/bin
./create_collections.sh
You should then see the collections:
solrctl instancedir --list
jobs_demo
log_analytics_demo
twitter_demo
yelp_demo
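To confirm the listing without reading it by eye, the output can be piped through a small check. This is only a sketch: `missing_instancedirs` is a hypothetical helper that reads names on stdin and reports any expected demo that is absent.

```shell
# Reads instance directory names on stdin (one per line) and prints
# "missing: NAME" for every expected name given as an argument.
missing_instancedirs() {
  local listing name status=0
  listing="$(cat)"
  for name in "$@"; do
    # -x: match the whole line, -F: treat the name as a fixed string
    if ! printf '%s\n' "$listing" | grep -qxF -- "$name"; then
      echo "missing: $name"
      status=1
    fi
  done
  return "$status"
}

# Usage on a cluster (not run here):
#   solrctl instancedir --list | missing_instancedirs jobs_demo log_analytics_demo twitter_demo yelp_demo
```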
The next step is to create the Solr cores. To keep it simple, we will just use one collection, the twitter demo. Create the core with:
sudo -u systest solrctl collection --create twitter_demo -s 1
If you are using Sentry, you will probably see this error the first time:
Error: A call to SolrCloud WEB APIs failed: HTTP/1.1 401 Unauthorized
Server: Apache-Coyote/1.1
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Version=1; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Content-Type: text/html;charset=utf-8
Content-Length: 997
Date: Thu, 11 Sep 2014 16:32:17 GMT
HTTP/1.1 401 Unauthorized
Server: Apache-Coyote/1.1
WWW-Authenticate: Negotiate YGwGCSqGSIb3EgECAgIAb10wW6ADAgEFoQMCAQ+iTzBNoAMCARCiRgRE62zOpPwr+KLoFKdUX2I6FtbN0DyxSA5a8n4BSZRJMTf413TEXzJbVh3/G7jWiMasIIzeETrd0Bv8suBsuKS/HdqG068=
Set-Cookie: hadoop.auth="[email protected]&t=kerberos&e=1410489137684&s=qAkcQr4ZPBkn5Ewg/Ugz/CqgLkU="; Version=1; Path=/; Expires=Fri, 12-Sep-2014 02:32:17 GMT; HttpOnly
Content-Type: application/xml;charset=UTF-8
Transfer-Encoding: chunked
Date: Thu, 11 Sep 2014 16:32:17 GMT
401 18 org.apache.sentry.binding.solr.authz.SentrySolrAuthorizationException: User systest does not have privileges for admin 401
This is because, by default, our ‘systest’ user does not have permission to create the core. ‘systest’ belongs to the ‘admin’ Unix/LDAP group, so we need to create a Sentry role that includes the ‘admin’ privilege and assign that role to a group that ‘systest’ belongs to.
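In the output above, the first 401 carries a bare `WWW-Authenticate: Negotiate` challenge, while the second one comes back with a `hadoop.auth` cookie and a Sentry exception, which suggests the Kerberos handshake succeeded and only the authorization check failed. A rough sketch for telling the two cases apart in captured output follows; `classify_solr_error` is a hypothetical helper, and the string matching is an assumption based solely on the messages shown in this post.

```shell
# Classify captured solrctl/curl output passed as the first argument.
classify_solr_error() {
  local output="$1"
  if printf '%s\n' "$output" | grep -q 'SentrySolrAuthorizationException'; then
    echo "authorization"   # user was authenticated, but Sentry denied the action
  elif printf '%s\n' "$output" | grep -q '401 Unauthorized'; then
    echo "authentication"  # 401 without a Sentry message: check the Kerberos ticket
  else
    echo "other"
  fi
}

# Usage (hypothetical):
#   out="$(sudo -u systest solrctl collection --create twitter_demo -s 1 2>&1)"
#   classify_solr_error "$out"
```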
To grant this privilege, we need to update:
/user/solr/sentry/sentry-provider.ini
with something similar to this:
[groups]
admin = admin_role
analyst = query_role
[roles]
admin_role = collection=admin->action=*, collection=twitter_demo->action=*
query_role = collection=twitter_demo->action=query
‘systest’ belongs to the LDAP ‘admin’ group. ‘admin’ is assigned the ‘admin_role’ role, which carries the ‘admin’ privilege. Regular analyst users belong to the LDAP ‘analyst’ group, which is assigned the Sentry ‘query_role’ role and its ‘query’ permission for the twitter collection. Here is the list of available permissions.
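To see how the two sections of the policy file fit together (group to role, then role to privileges), here is a small sketch that resolves the privileges of a group from a local copy of sentry-provider.ini. `privileges_for_group` is a hypothetical helper written for illustration; it is not a Sentry tool and only handles the simple layout shown above.

```shell
# privileges_for_group GROUP FILE
# Prints "ROLE: PRIVILEGES" for every role assigned to GROUP.
privileges_for_group() {
  local group="$1" file="$2"
  awk -v group="$group" '
    /^\[.*\]$/ { section = $0; next }      # remember the current section header
    {
      eq = index($0, "=")                  # split on the FIRST "=" only,
      if (eq == 0) next                    # since privileges contain "=" too
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)     # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (section == "[groups]" && key == group) roles = val
      if (section == "[roles]") privs[key] = val
    }
    END {
      n = split(roles, list, /[ \t]*,[ \t]*/)
      for (i = 1; i <= n; i++)
        if (list[i] in privs) print list[i] ": " privs[list[i]]
    }
  ' "$file"
}

# Usage (hypothetical): privileges_for_group analyst sentry-provider.ini
```

With the file above, the ‘analyst’ group resolves to ‘query_role’ and its single query privilege, while an unknown group prints nothing.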
Note
The upcoming Hue 3.7 has a new Sentry App that lets you forget about sentry-provider.ini and configure the above in a Web UI. Moreover, Solr Sentry support will be integrated in Hue as soon as its API becomes available.
Then it is time to create the core and upload some data. Update the post.sh command to make it work with Kerberos.
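Replace ‘curl’ by the following (the empty ‘-u :’ only activates the Negotiate authentication code; the identity comes from the Kerberos ticket):
curl --negotiate -u :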
and make sure that you use the real hostname in the URL:
URL=http://hue-c5-sentry.ent.cloudera.com:8983/solr
A quick way to test it is to run the indexing command:
sudo -u systest curl --negotiate -u : http://hue-c5-sentry.ent.cloudera.com:8983/solr/twitter_demo/update --data-binary @../collections/solr_configs_twitter_demo/index_data.csv -H 'Content-type:text/csv'
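The same call can be wrapped in a small function, for example when adapting post.sh to several files. This is a sketch under assumptions: `post_csv` and `SOLR_URL` are hypothetical names (the actual contents of post.sh are not shown in this post), and curl must be built with GSS-API support for `--negotiate` to work.

```shell
# Base URL of the Solr server; use the real hostname, as Kerberos
# service principals are tied to it.
SOLR_URL="${SOLR_URL:-http://hue-c5-sentry.ent.cloudera.com:8983/solr}"

# post_csv COLLECTION FILE
# Sends a CSV file to the collection's update handler using SPNEGO.
post_csv() {
  local collection="$1" file="$2"
  # "-u :" supplies empty credentials; the Kerberos ticket is what counts
  curl --negotiate -u : "$SOLR_URL/$collection/update" \
       --data-binary "@$file" -H 'Content-type:text/csv'
}

# Usage (hypothetical):
#   post_csv twitter_demo ../collections/solr_configs_twitter_demo/index_data.csv
```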
And that’s it! The collection with its data will appear in Solr and Hue. Depending on their group, users can or cannot modify the collection.
Your organization can now leverage the exploration capacity of the Search app with fine-grained security! Upcoming versions will bring field-level security and a nice UI for configuring it (no more sentry-provider.ini :).
As usual, feel free to comment on the hue-user list or @gethue!