Tuesday, April 11, 2017

Difference between GROUP and COGROUP in Pig

GROUP

Groups the data in one or more relations.
Note: The GROUP and COGROUP operators are functionally identical; both work with one or more relations. The difference is one of readability: by convention, GROUP is used in statements involving one relation and COGROUP in statements involving two or more. COGROUP accepts at most 127 relations at a time.

Syntax

alias = GROUP alias { ALL | BY expression} [, alias ALL | BY expression …] [USING 'collected' | 'merge'] [PARTITION BY partitioner] [PARALLEL n];

Example:

X = COGROUP A BY owner, B BY friend2;

DESCRIBE X;
X: {group: chararray,A: {owner: chararray,pet: chararray},B: {friend1: chararray,friend2: chararray}}
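
For contrast, here is a minimal sketch that puts the single-relation and two-relation forms side by side. The input file names 'data1' and 'data2' are assumptions made for illustration, and the schemas are inferred from the DESCRIBE output above, so adjust both to match your data.

```pig
-- Hypothetical input files; schemas inferred from the DESCRIBE output above.
A = LOAD 'data1' AS (owner:chararray, pet:chararray);
B = LOAD 'data2' AS (friend1:chararray, friend2:chararray);

-- One relation: written as GROUP by convention.
-- Each output tuple is (group, bag of A tuples sharing that owner).
G = GROUP A BY owner;

-- Two relations: written as COGROUP by convention.
-- Each output tuple is (group, bag of matching A tuples, bag of matching B tuples).
X = COGROUP A BY owner, B BY friend2;

DUMP G;
DUMP X;
```

In the COGROUP result, a key that appears in only one of the relations still produces an output tuple; the bag for the other relation is simply empty.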
