Create Hive tables with headers and load quoted CSV data

Hue makes it easy to create Hive tables.

With HUE-1746, Hue guesses the column names and types (int, string, float…) directly from your data. If the file starts with a header row, Hue automatically uses it for the column names and skips it when creating the table.
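For readers working outside the Hue wizard, here is a rough hand-written sketch of a comparable result in plain HiveQL. It is not what Hue runs internally. The column names, their types, and the HDFS path are hypothetical placeholders, and the `skip.header.line.count` table property requires Hive 0.13 or later.

```sql
-- Hypothetical columns: Hue would infer the real names and types from the file.
-- The table property tells Hive to ignore the first line (the header) of each file.
CREATE TABLE banks (
  bank_name STRING,
  city      STRING,
  cert      INT
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Hypothetical HDFS path to the uploaded CSV file.
LOAD DATA INPATH '/user/hue/banklist.csv' INTO TABLE banks;
```

A plain comma-delimited table like this one splits on every comma, including commas inside quoted fields, which is why a quoted-CSV SerDe is needed for such data.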

Quoted CSV fields are also supported, thanks to HUE-1747.

Here is the data file used:

http://www.fdic.gov/bank/individual/failed/banklist.html

This is the SerDe for reading quoted CSV:

https://github.com/ogrodnek/csv-serde

And the command to switch the SerDe used by the table:

ALTER TABLE banks SET SERDE 'com.bizo.hive.serde.csv.CSVSerde'
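A slightly fuller sketch of this step follows, assuming the SerDe jar has already been downloaded. The jar path is a placeholder, not a real location. Hive needs the jar registered in the session (or otherwise on its classpath) before it can resolve the SerDe class.

```sql
-- Register the SerDe jar for this session (placeholder path).
ADD JAR /path/to/csv-serde.jar;

-- Switch the table to the quoted-CSV SerDe.
ALTER TABLE banks SET SERDE 'com.bizo.hive.serde.csv.CSVSerde';

-- Check which SerDe library the table now reports.
DESCRIBE FORMATTED banks;

-- Spot-check that quoted fields containing commas land in a single column.
SELECT * FROM banks LIMIT 10;
```

If the `SELECT` still shows fields split in the middle of a quoted value, the `DESCRIBE FORMATTED` output is the first place to check that the SerDe change took effect.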

Now go analyze the data with the Hive, Impala or Pig editors!

8 Comments

  1. clancey 3 years ago

    accent was a little difficult to understand but incredibly helpful! thank you for uploading

  2. Max Dumas 3 years ago

    Allo Romain
    I tried with Impala after running fine in Hive and get this error:
    “AnalysisException: Failed to load metadata for table: default.banks CAUSED BY: TableLoadingException: Failed to load metadata for table: banks CAUSED BY: InvalidStorageDescriptorException: Impala does not support tables of this type. REASON: SerDe library ‘com.bizo.hive.serde.csv.CSVSerde’ is not supported.”

  3. SwapnilG 2 years ago

    Hi Romain,

    I am getting following error in the last step, while altering the table:
    Error while processing statement: invalid url: maprfs:////user/mapr/csv-serde-1.1.2-0.11.0-all.jar, expecting ( file | hdfs | ivy) as url scheme.

    As per your directions, the command I ran to alter my table is:
    ALTER TABLE crash_data SET SERDE ‘com.bizo.hive.serde.csv.CSVSerde’;

    I am using mapr cluster and I have already uploaded csv serde jar using HUE file browser. Is it necessary to move csv serde jar file on maprfs in order to make this query work?

    • Hue Team 2 years ago

      This is an error coming from Hive not Hue right?

  4. Ryan 3 weeks ago

    The download link on the github page for the CSV Support tool is broken. The file doesn’t exist anymore.
