Simplifying the end user Data Catalog search

01 May 2018 in Querying / Version 4 - 2 minutes read

Data Catalog Search

Before typing any query to get insights, users need to find and explore the correct datasets. The usability of the Data Catalog search has improved with each release. It is accessible from the top bar of the interface and offers free-text search of SQL tables, columns, tags and saved queries. This is particularly useful for quickly looking up a table among thousands or finding existing queries that already analyse a certain dataset.

In this iteration, the search provides more results directly via the ‘Show more’ link. Existing tags can be faceted simply by typing ‘tags:’.

Some example searches:

  • usage → Returns any table matching ‘usage’ in its name, description or tags.
  • type:view customer → Finds the customer view.
  • tax* tags:finance → Lists all the tables and views starting with ‘tax’ and tagged with ‘finance’.
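To make the search syntax above more concrete, here is a minimal Python sketch of how such a query could be split into free-text terms and `type:` / `tags:` facets, then matched against catalog entries. It is only an illustration and not Hue's actual implementation: the names `parse_query`, `matches`, `search` and the `CATALOG` sample data are hypothetical, and matching wildcard terms against the object name only is an assumption.

```python
import fnmatch
from dataclasses import dataclass, field

# Facet prefixes taken from the examples above; anything else is free text.
FACET_KEYS = ("type", "tags")


@dataclass
class ParsedQuery:
    terms: list = field(default_factory=list)   # free-text terms, lowercased
    facets: dict = field(default_factory=dict)  # facet name -> wanted values


def parse_query(text):
    """Split 'tax* tags:finance' into terms ['tax*'] and facets {'tags': ['finance']}."""
    parsed = ParsedQuery()
    for token in text.split():
        key, sep, value = token.partition(":")
        if sep and key in FACET_KEYS and value:
            parsed.facets.setdefault(key, []).append(value.lower())
        else:
            parsed.terms.append(token.lower())
    return parsed


def matches(entry, parsed):
    """Return True if a catalog entry satisfies every facet and every term."""
    entry_tags = {tag.lower() for tag in entry["tags"]}
    if any(entry["type"].lower() != wanted for wanted in parsed.facets.get("type", [])):
        return False
    if any(wanted not in entry_tags for wanted in parsed.facets.get("tags", [])):
        return False
    haystack = " ".join([entry["name"], entry["description"], *entry["tags"]]).lower()
    for term in parsed.terms:
        if "*" in term:
            # Assumption: wildcard terms are matched against the name only.
            if not fnmatch.fnmatchcase(entry["name"].lower(), term):
                return False
        elif term not in haystack:
            return False
    return True


# Hypothetical sample catalog used only for this sketch.
CATALOG = [
    {"name": "tax_2018", "type": "table", "description": "Yearly tax records", "tags": ["finance"]},
    {"name": "taxonomy", "type": "table", "description": "Product categories", "tags": ["product"]},
    {"name": "customer", "type": "view", "description": "Customer view", "tags": []},
    {"name": "web_logs", "type": "table", "description": "Cluster usage logs", "tags": []},
]


def search(text, catalog=CATALOG):
    """Return the names of all catalog entries matching the query text."""
    parsed = parse_query(text)
    return [entry["name"] for entry in catalog if matches(entry, parsed)]
```

With this sample data, `search("tax* tags:finance")` keeps `tax_2018` and drops `taxonomy`, because the second table starts with ‘tax’ but does not carry the ‘finance’ tag.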

    Searching all the available queries or data in the cluster

    Listing the possible tags to filter on. This also works for 'types'.

    Unification and Caching of all SQL metadata

    The list of tables and their columns is displayed in multiple parts of the interface. This data is costly to fetch and comes from different sources. In this new version, the information is cached and reused by all the Hue components. As the sources are diverse (e.g. Apache Hive, Cloudera Navigator, Cloudera Optimizer), their data is stored in a single object, so that it is easier and faster to display without caring about the underlying technical details.
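The idea of a single cached object fed by several sources can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the code Hue uses: the `CatalogCache` class, its `ttl_seconds` parameter and the idea of passing one fetch function per source are all hypothetical.

```python
import time


class CatalogCache:
    """Merge metadata from several sources into one cached object per table."""

    def __init__(self, sources, ttl_seconds=300, clock=time.monotonic):
        # sources: mapping of source name -> callable(path) returning a dict.
        # The source names and the TTL policy are assumptions for this sketch.
        self._sources = sources
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # path -> (expiry time, merged entry)

    def get(self, path):
        """Return the unified entry for a path, fetching it only when needed."""
        now = self._clock()
        cached = self._entries.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]  # still fresh: no source is contacted again
        merged = {"path": path}
        for name, fetch in self._sources.items():
            merged[name] = fetch(path)  # one costly call per source
        self._entries[path] = (now + self._ttl, merged)
        return merged
```

Every component that needs the table list or its columns would then call `get()` on the same cache instance, so the costly fetches happen once per entry until it expires.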

     

    In addition to editing the tags of SQL objects such as tables, views and columns, which has been available since version one, table descriptions can now also be edited. This allows self-service documentation of the metadata by end users, which was not possible until now because directly editing Hive comments requires admin Sentry privileges that are not granted to regular users in a secure cluster.

     

    In the upcoming version, this information is also reused on the Catalog pages.

    Showing all the common data now cached and unified for a slicker experience

     

     

    As usual, thank you to all the project contributors, and to everyone sending feedback and participating on the hue-user list or @gethue!

