sql - What is the difference between TRUNC and TO_DATE in Hive -


hi trying find out difference between using trunc , to_date in hive.

currently within oracle wrote following case statement against data shown below:

order_no | name | date_ | task_no abc123 | humpty | 07-oct-16 12:30:54 | 1 abc123 | humpty | 07-oct-16 12:30:54 | 2 abc123 | humpty | 07-oct-16 12:32:20 | 6  select order_no, name, date_, task_no     (case when date_ - lag(date_) on (partition order_no, name, trunc(date_) order date_) <= 1/48       0 else 1 end) count1 

and gives me result:

order_no | name | date_ | task_no | count1     abc123 | humpty | 07-oct-16 12:30:54 | 1 | 1     abc123 | humpty | 07-oct-16 12:30:54 | 2 | 0     abc123 | humpty | 07-oct-16 12:32:20 | 6 | 1 

which correct. if use same query in hive against full data set error message:

error while compiling statement: failed: semanticexception failed breakup windowing invocations groups. @ least 1 group must depend on input columns. 

so changed trunc to_date , works , gives me following results:

select order_no, name, date_, task_no (case when date_ - lag(date_) on (partition order_no, name, to_date(date_) order date_) <= 1/48       0 else 1 end) count1 

and gives me result:

order_no | name | date_ | task_no | count1 abc123 | humpty | 07-oct-16 12:30:54 | 1 | 1 abc123 | humpty | 07-oct-16 12:32:20 | 6 | 1         abc123 | humpty | 07-oct-16 12:30:54 | 2 | 1 

which different in oracle. can gather date value stored string results arent ordered in date/time , think problem lies not sure changes need make fix it.

would appreciate advice.


updated code:

select   order_no ,name ,date_fixed ,task_no ,case when date_uts - lag(date_uts) on (partition order_no, name, to_date(date_fixed) order date_fixed) <= 60*30    0 else    1 end count1 ( select order_no ,name ,task_no ,from_unixtime(unix_timestamp(date_, 'dd-mmm-yy hh:mm:ss')) date_fixed ,unix_timestamp(date_, 'dd-mmm-yy hh:mm:ss') date_uts table1 ) t 

1

hive operators , user-defined functions (udfs)

to_date

returns date part of timestamp string (pre-hive 2.1.0):
to_date("1970-01-01 00:00:00") = "1970-01-01".
of hive 2.1.0, returns date object.
prior hive 2.1.0 (hive-13248) return type string because no date type existed when method created.


trunc

returns date truncated unit specified format (as of hive 1.2.0).
supported formats: month/mon/mm, year/yyyy/yy.
example: trunc('2015-03-17', 'mm') = 2015-03-01.

2

you have errors in original query

  1. there no comma between task_no , (case when
  2. trunc in hive must take 1 parameter, , there no parameter day.
  3. there no minus operator dates (and definitly not strings). results in null.

3

the recognize date format in hive yyyy-mm-dd not match data.
applying date functions on invalid string result in null.

this how convert data format dates:

hive> select from_unixtime(unix_timestamp('07-oct-16 12:30:54','dd-mmm-yy hh:mm:ss')); ok 2016-10-07 12:30:54 

and whole query:

select  order_no        ,name        ,date_fixed        ,task_no         ,case              when    date_uts                  -   lag(date_uts) on                      (                         partition    order_no,name,to_date(date_fixed)                          order        date_fixed                     )                 <= 60*30                0             else    1         end             count1    (select  order_no                ,name                ,task_no                ,from_unixtime(unix_timestamp(date_,'dd-mmm-yy hh:mm:ss'))   date_fixed                ,unix_timestamp(date_,'dd-mmm-yy hh:mm:ss')                  date_uts             t         ) t ;         

abc123  humpty  2016-10-07 12:30:54 2   1 abc123  humpty  2016-10-07 12:30:54 1   0 abc123  humpty  2016-10-07 12:32:20 6   0 

this results when tested on oracle


Comments

Popular posts from this blog

python - How to insert QWidgets in the middle of a Layout? -

python - serve multiple gunicorn django instances under nginx ubuntu -

module - Prestashop displayPaymentReturn hook url -