What is the performance of INNER JOIN vs WHERE?

Teradata data type: NUMBER vs. INT/SMALLINT/BYTEINT. Which is better?

  • With version 14, Teradata finally supports the NUMBER data type, which stores numeric values in Oracle fashion. Historically, Teradata has tended to advocate fixed-length data types, such as INT/BYTEINT/CHAR(n), for better performance. Even its catalog/metadata tables are full of CHAR(30). In 14, will NUMBER replace all the binary numeric types? NUMBER is variable-length, which is more efficient for storage, but what is the negative impact of using NUMBER in Teradata?

    How about a join

    * between NUMBER(10) and NUMBER(22)
    * between NUMBER(8) and INT
    * between NUMBER and NUMBER(15,2)?

    Will TD rehash/redistribute the data across AMPs because the data types on the two sides of the join don't match exactly (even though the values are the same)?

    Just tested it in 14.00 last week. Here is the answer:

    * HashRow() generates different values for NUMBER vs. INT
    * HashRow() generates the same value for DECIMAL(8) and INT
    * A RowHash match scan will happen for the join between DECIMAL(8) and INT
    * A RowHash match scan will happen for the join between NUMBER(10) and NUMBER(22)
    * Fan-out and re-hash will happen for the join between NUMBER and INT

  • Answer:

    As stated in the question, fixed-length data types have been used historically, and a lot of enhancements have been made to extend support for variable-length data types (such as multi-value compression for VARCHAR in Teradata 13.10). But the NUMBER data type is really useful only as a replacement for DECIMAL and FLOAT, not for INTEGER, because different approaches to calculation are used.

    As for joins, I wasn't able to find anything about how hashing is done for the NUMBER data type (and thus how indexes are distributed across AMPs). But I ran some experiments, and here's what I found:

    * All the different NUMBER types hash to the same value as long as no rounding is needed. This makes sense: without rounding, the mantissa and exponent [1, p. 131] are the same, so the internal representation of the number is the same.
    * For the same value, the NUMBER data type has a hash different from INTEGER, FLOAT and DECIMAL.
    * The same value cast to NUMBER with different precisions will have different hashes whenever the precision forces rounding.

    To answer your specific examples, consider the cases where Table1 and Table2 have the following primary indexes:

    A: Table1 has an index on NUMBER(10) and Table2 on NUMBER(22).
    B: Table1 has an index on NUMBER(8) and Table2 on INTEGER.
    C: Table1 has an index on NUMBER and Table2 on NUMBER(15,2).

    A hash join (no data redistribution) will occur only for cases A and C.

    [1] SQL Data Types and Literals - http://www.info.teradata.com/edownload.cfm?itemid=113480015
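
    The experiments described above can be repeated with a couple of short queries. What follows is only a sketch, assuming a Teradata 14.00 or later system; the table names (t_num10, t_num22, t_num8, t_int) and the test value 12345 are made up for illustration and do not come from the original tests. The first query compares the row hash of one value under several numeric types. The EXPLAIN statements then show whether the optimizer plans a redistribution step for cases A and B.

    ```sql
    -- Same value, different numeric types: compare the row hashes side by side.
    -- Equal hashes mean rows with equal values land on the same AMP.
    SELECT
        HASHROW(CAST(12345 AS INTEGER))    AS hash_int,
        HASHROW(CAST(12345 AS DECIMAL(8))) AS hash_dec8,
        HASHROW(CAST(12345 AS NUMBER))     AS hash_number,
        HASHROW(CAST(12345 AS NUMBER(10))) AS hash_number10,
        HASHROW(CAST(12345 AS NUMBER(22))) AS hash_number22;

    -- Hypothetical tables for cases A and B; the join column is the primary index.
    CREATE TABLE t_num10 (id NUMBER(10) NOT NULL, payload VARCHAR(20)) PRIMARY INDEX (id);
    CREATE TABLE t_num22 (id NUMBER(22) NOT NULL, payload VARCHAR(20)) PRIMARY INDEX (id);
    CREATE TABLE t_num8  (id NUMBER(8)  NOT NULL, payload VARCHAR(20)) PRIMARY INDEX (id);
    CREATE TABLE t_int   (id INTEGER    NOT NULL, payload VARCHAR(20)) PRIMARY INDEX (id);

    -- Case A: NUMBER(10) joined to NUMBER(22).
    EXPLAIN
    SELECT a.id, b.payload
    FROM t_num10 a
    INNER JOIN t_num22 b ON a.id = b.id;

    -- Case B: NUMBER(8) joined to INTEGER.
    EXPLAIN
    SELECT a.id, b.payload
    FROM t_num8 a
    INNER JOIN t_int b ON a.id = b.id;
    ```

    In the EXPLAIN text, a step that redistributes a table by hash code or duplicates it on all AMPs means the primary indexes could not be joined in place. A RowHash match scan with no such preceding step means they could. Plans also depend on statistics and table sizes, so it is worth loading representative data before drawing conclusions.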

Alexander Bessonov at Quora
