Advanced Analytics - Presto Functions and Operators Quick Review -

This post is a lot different from our earlier entries. Think of it as a reference flag post for people interested in a quick lookup for advanced analytics functions and operators used in modern data lake operations based on Presto. So you could, of course, use it in Presto installations, but also in some other commercial products such as AWS Athena that is used widely these days to facilitate analytic operations on Enterprise Data Lakes built on top of Amazon S3.

Without further ado, let’s dive straight into the nuts and bolts of these queries for advanced analytics:

JSON Functions

is_json_scalar(json) → boolean

Determines if json is a scalar (i.e. a JSON number, a JSON string, true, false or null).
Example:

Query	Output
SELECT is_json_scalar(‘1’)	true
SELECT is_json_scalar(‘[1, 2, 3]’)	false

json_array_contains(json, value) → boolean

Determines if value exists in json (a string containing a JSON array)
Example:

Query	Output
SELECT json_array_contains(‘[1, 2, 3]’, 2)	true

json_array_get(json_array, index) → json

Example:

Query	Output
SELECT json_array_get(‘[“a”, [3, 9], “c”]’, 0)	a
SELECT json_array_get(‘[“a”, [3, 9], “c”]’, 10)	null
SELECT json_array_get(‘[“c”, [3, 9], “a”]’, -2)	JSON ‘[3,9]’

json_array_length(json) → bigint

Returns the array length of json (a string containing a JSON array)
Example:

Query	Output
SELECT json_array_length(‘[1, 2, 3]’)	3

json_extract(json, json_path) → json

Evaluates the JSONPath-like expression json_path on json (a string containing JSON) and returns the result as a JSON string
Example:

Query	Output
SELECT json_extract(json_parse(‘{“email”: {“abcd”: “xyz@yahoo.com”}, “phone_numbers”: [5678908, 587578575, 668798]}’), ‘$.email.abc’)	“xyz@yahoo.com”

json_extract_scalar(json, json_path) → varchar

Just like json_extract(), but returns the result value as a string (as opposed to being encoded as JSON). The value referenced by json_path must be a scalar (boolean, number or string)
Example:

Query	Output
SELECT json_extract_scalar(json_parse(‘{“email”: {“abcd”: “xyz@yahoo.com”}, “phone_numbers”: [5678908, 587578575, 9999999]}’), ‘$.phone_numbers[0]’)	9999999
SELECT json_extract_scalar(json_parse(‘{“email”: {“abcd”: “xyz@yahoo.com”}, “phone_numbers”: [{“home”: 5678908}, {“mob”: 587578575}, {“cell”: 9999999}]}’), ‘$.phone_numbers[1].mob’)	587578575
SELECT json_extract_scalar(json_parse(‘{“email”: “xyz@yahoo.com”, “phone_numbers”: [{“home”: 5678908}, {“mob”: 587578575}, {“cell”: 9999999}]}’), ‘$[email]’)	xyz@yahoo.com

json_parse(string) → json

Returns the JSON value deserialized from the input JSON text. This is an inverse function to json_format()
Example:

Query	Output
SELECT json_parse(‘{“email”: {“abcd”: “xyz@yahoo.com”}, “phone_numbers”: [{“home”: 5678908}, {“mob”: 587578575}, {“cell”: 9999999}]}’)	{“email”: {“abcd”: “xyz@yahoo.com”}, “phone_numbers”: [{“home”: 5678908}, {“mob”: 587578575}, {“cell”: 9999999}]}

json_size(json, json_path) → bigint

Just like json_extract(), but returns the size of the value. For objects or arrays, the size is the number of members, and the size of a scalar value is zero.
Example:

Query	Output
SELECT json_size(json_parse(‘{“email”: “xyz@yahoo.com”, “phone_numbers”: [{“home”: 5678908}, {“mob”: 587578575}, {“cell”: 9999999}]}’), ‘$.phone_numbers’)	3
SELECT json_size(‘{“x”: {“a”: 1, “b”: 2}}’, ‘$.x.a’)	0

Date and Time Functions and Operators

current_date → date

Returns the current date as of the start of the query.

current_time → time with time zone

Returns the current time as of the start of the query. (with UTC)

current_timestamp → timestamp with time zone

Returns the current timestamp as of the start of the query.

current_timezone() → varchar

Returns the current time zone in the format defined by IANA (e.g., America/Los_Angeles) or as fixed offset from UTC (Example: +08:35)

date(x) → date

This is an alias for CAST(x AS date).
Example:

Query	Output
SELECT DATE(‘2019-08-07’)	SELECT CAST(‘2019-08-07’ AS DATE);

now() → timestamp with time zone

This is an alias for current_timestamp.
SELECT now();

date_trunc(unit, x) → [same as input]

Returns x truncated to unit.
Example:

Query	Output
SELECT date_trunc(‘second’, current_timestamp)	2019-08-16 06:51:29.000 UTC (returns value upto unit)
SELECT date_trunc(‘minute’, current_timestamp)	2019-08-16 06:54:00.000 UTC

Interval Functions

Unit	Description
millisecond	Milliseconds
second	Seconds
minute	Minutes
hour	Hours
day	Days
week	Weeks
month	Months
quarter	Quarters of a year
year	Year

date_add(unit, value, timestamp) → [same as input]

Adds an interval value of type unit to timestamp. Subtraction can be performed by using a negative value.
Example:

Query	Output
SELECT date_add(‘month’, 1, current_timestamp)	2019-09-16 06:59:55.425 UTC
SELECT date_add(‘day’, 1, current_timestamp)	2019-08-17 07:01:26.834 UTC

date_diff(unit, timestamp1, timestamp2) → bigint

Returns timestamp2 – timestamp1 expressed in terms of unit.
Example:

Query	Output
SELECT date_diff(‘day’, current_timestamp, date_add(‘day’, 10, current_timestamp))	10

parse_duration(string) → interval

Parses string of format value unit into an interval, where value is fractional number of unit values
Example:

Query	Output
SELECT parse_duration(‘42.8ms’)	0 00:00:00.043
SELECT parse_duration(‘3.81 d’)	3 19:26:24.000
SELECT parse_duration(‘5m’)	0 00:05:00.000

date_format(timestamp, format) → varchar

Formats timestamp as a string using format (converts timestamp to string)
Example:

Query	Output
SELECT date_format(current_timestamp, ‘%Y-%m-%d’)	2019-08-16

date_parse(string, format) → timestamp

Parses string into a timestamp using format. (converts string to timestamp)
Example:

Query	Output
SELECT date_parse(‘2019-08-16’, ‘%Y-%m-%d’)	2019-08-16 00:00:00.000

day(x) → bigint

Returns the day of the month from x.
Example:

Query	Output
SELECT day(date_parse(‘2019-08-16’, ‘%Y-%m-%d’))	16

day_of_month(x) → bigint

This is an alias for day().

day_of_week(x) → bigint

Returns the ISO day of the week from x. The value ranges from 1 (Monday) to 7 (Sunday).
Example:

Query	Output
SELECT day_of_week(date_parse(‘2019-08-16’, ‘%Y-%m-%d’))	5

year(x) → bigint

Returns the year from x.

Aggregate Functions

array_agg(x) → array<[same as input]>

Returns an array created from the input x elements
The array_agg() function is an aggregate function that accepts a set of values and returns an array in which each value in the input set is assigned to an element of the array.
Syntax:
array_agg(expression [ORDER BY [sort_expression {ASC | DESC}], […])
Example:

Query

SELECT title, array_agg (first_name || ‘ ‘ || last_name) actors FROM film

SELECT title, array_agg (first_name || ‘ ‘ || last_name ORDER BY first_name) actors FROM film

avg(x) → double

Returns the average (arithmetic mean) of all input values

bool_and(boolean) → boolean

Returns TRUE if every input value is TRUE, otherwise FALSE.

bool_or(boolean) → boolean

Returns TRUE if any input value is TRUE, otherwise FALSE.

count(*) → bigint

Returns the number of input rows.

count(x) → bigint

Returns the number of non-null input values.

count_if(x) → bigint

Returns the number of TRUE input values. This function is equivalent to count(CASE WHEN x THEN 1 END).

arbitrary(x) → [same as input]

Returns an arbitrary non-null value of x, if one exists.
Arbitrary chooses one value out of a set of values. arbitrary is useful for silencing warnings about values neither grouped by or aggregated over.

max_by(x, y) → [same as x]

Returns the value of x associated with the maximum value of y over all input values.
The max_by takes two arguments and returns the value of the first argument for which the value of the second argument is maximized.
If multiple rows maximize the result of the second value, and arbitrary first value is chosen from among them. max_by can be used with both numeric and non-numeric data.
Example:

Query	Output
SELECT max_by(close_date, close_value) as date_of_max_sale FROM sales_pipeline	query returns the date where close_value is maximum

max_by(x, y, n) → array<[same as x]>

Returns n values of x associated with the n largest of all input values of y in descending order of y.

min_by(x, y) → [same as x]

Returns the value of x associated with the minimum value of y over all input values.

min_by(x, y, n) → array<[same as x]>

Returns n values of x associated with the n smallest of all input values of y in ascending order of y.

max(x, n) → array<[same as x]>

Returns n largest values of all input values of x.

min(x) → [same as input]

Returns the minimum value of all input values.

min(x, n) → array<[same as x]>

Returns n smallest values of all input values of x.

reduce_agg(inputValue T, initialState S, inputFunction(S, T, S), combineFunction(S, S, S)) → S

Reduces all input values into a single value. inputFunction will be invoked for each input value. In addition to taking the input value, inputFunction takes the current state, initially initialState, and returns the new state. combineFunction will be invoked to combine two states into a new state. The final state is returned
Example:

Query	Output
SELECT id, reduce_agg(value, 0, (a, b) -> a + b, (a, b) -> a + b) FROM (VALUES (1, 2) (1, 3), (1, 4), (2, 20), (2, 30), (2, 40) ) AS t(id, value) GROUP BY id	(1, 9), (2, 90)

Array Functions and Operators

Subscript Operator: [ ]

The [ ] operator is used to access an element of an array and is indexed starting from one:
If a column has values like [1, 3], [45, 46]
Example:

Query	Output
SELECT column[1]	1, 45

Concatenation Operator: ||

The || operator is used to concatenate an array with an array or an element of the same type.
Example:

Query	Output
SELECT ARRAY [1] \|\| ARRAY [2]	[1, 2]
SELECT ARRAY [1] \|\| 2	[1, 2]
SELECT 2 \|\| ARRAY [1]	[2, 1]

array_distinct(x) → array

Remove duplicate values from the array x.
Example:

Query	Output
SELECT array_distinct(ARRAY[1, 1, 1, 2])	[1, 2]

array_intersect(x, y) → array

Returns an array of the elements in the intersection of x and y, without duplicates.
Common elements in two arrays without duplicates.
Example:

Query	Output
SELECT array_intersect(ARRAY[1, 1, 1, 2], ARRAY[10, 15, 1, 100])	[1]

array_union(x, y) → array

Returns an array of the elements in the union of x and y, without duplicates.
Example:

Query	Output
SELECT array_union(ARRAY[1, 1, 1, 2], ARRAY[10, 15, 1, 100])	[1, 2, 10, 15, 100]

array_except(x, y) → array

Returns an array of elements in x but not in y, without duplicates.
Example:

Query	Output
SELECT array_except(ARRAY[1, 1, 1, 2], ARRAY[10, 15, 1, 100])	[2]

array_join(x, delimiter, null_replacement) → varchar

Concatenates the elements of the given array using the delimiter and an optional string to replace nulls.
Example:

Query	Output
SELECT array_join(ARRAY[1, 71, 81, 92, null], ‘/’, ‘abcd’)	1/71/81/92/abcd

array_max(x) → x

Returns the maximum value of input array.
Example:

Query	Output
SELECT array_max(ARRAY[1, 71, 81, 92, 100])	100
SELECT array_max(ARRAY[1, 71, 81, 92, null])	null

array_min(x) → x

Returns the minimum value of input array.
Example:

Query	Output
SELECT array_min(ARRAY[1, 71, 81, 92, 100])	1

array_position(x, element) → bigint

Returns the position of the first occurrence of the element in array x (or 0 if not found).
Example:

Query	Output
SELECT array_position(ARRAY[1, 71, 81, 92, 100], 100)	5

array_remove(x, element) → array

Remove all elements that equal element from array x.
Example:

Query	Output
SELECT array_remove(ARRAY[1, 71, 81, 92, 100, 1], 1)	[71, 81, 92, 100]

array_sort(x) → array

Sorts and returns the array x. The elements of x must be orderable. Null elements will be placed at the end of the returned array.
Example:

Query	Output
SELECT array_sort(ARRAY[1, null, 8, 9, 71, 81, 92, 100, 12, 51, 10, 7, 1, null])	[1, 1, 7, 8, 9, 10, 12, 51, 71, 81, 92, 100, null, null]

array_sort(array(T), function(T, T, int)) → array(T)

Sorts and returns the array based on the given comparator function. The comparator will take two nullable arguments representing two nullable elements of the array. It returns -1, 0, or 1 as the first nullable element is less than, equal to, or greater than the second nullable element.
If the comparator function returns other values (including NULL), the query will fail and raise an error.
Example:

Query	Output
SELECT array_sort(ARRAY [3, 2, 5, 1, 2], (x, y) -> IF(x < y, 1, IF(x = y, 0, -1)))	[5, 3, 2, 2, 1]

cardinality(x) → bigint

Returns the cardinality (size) of the array x.
Example:

Query	Output
SELECT cardinality(ARRAY[1, 81, 92, 100, 12, 51, 10])	7

arrays_overlap(x, y) → boolean

Tests if arrays x and y have any any non-null elements in common. Returns null if there are no non-null elements in common but either array contains null.
Example:

Query	Output
SELECT arrays_overlap(ARRAY[1, 81, 92, 100, 12, 51, 10], ARRAY[101])	false
SELECT arrays_overlap(ARRAY[1, 81, 92, 100, 12, 51, 10], ARRAY[101, 51])	true

concat(array1, array2, …, arrayN) → array

Concatenates the arrays array1, array2, …, arrayN. This function provides the same functionality as the SQL-standard concatenation operator (||).

contains(x, element) → boolean

Returns true if the array x contains the element.
Example:

Query	Output
SELECT contains(ARRAY[1, 81, 92, 100, 12, 51, 10], 1)	true
SELECT contains(ARRAY[‘abcd’, ‘test’, ‘xyz’], ‘xyz’)	true

element_at(array(E), index) → E

SQL array indices start at 1
Returns element of array at given index. If index > 0, this function provides the same functionality as the SQL-standard subscript operator ([]). If index < 0, element_at accesses elements from the last to the first.
Example:

Query	Output
SELECT contains(ARRAY[1, 81, 92, 100, 12, 51, 10], 2)	81

filter(array(T), function(T, boolean)) → array(T)

Constructs an array from those elements of array for which function returns true.
Example:

Query	Output
SELECT filter(ARRAY [5, -6, NULL, 7], x -> x > 0)	[5, 7]

flatten(x) → array

Flattens an array(array(T)) to an array(T) by concatenating the contained arrays.

reduce(array(T), initialState S, inputFunction(S, T, S), outputFunction(S, R)) → R

Returns a single value reduced from array. inputFunction will be invoked for each element in array in order.
In addition to taking the element, inputFunction takes the current state, initially initialState, and returns the new state. outputFunction will be invoked to turn the final state into the result value.
It may be the identity function (i -> i).
Example:

Query	Output
SELECT reduce(ARRAY [5, 20, NULL, 50], 0, (s, x) -> s + COALESCE(x, 0), s -> s)	75
SELECT reduce(ARRAY [5, 20, NULL, 50], 1, (s, x) -> s * COALESCE(x, 1), s -> s)	5000

repeat(element, count) → array

Repeat element for count times

reverse(x) → array

Returns an array which has the reversed order of array x.
Example:

Query	Output
SELECT reverse(ARRAY[2, 5])	[5, 2]

sequence(start, stop) → array(bigint)

Generate a sequence of integers from start to stop, increments by 1 if start is less than or equal to stop, otherwise decrements by 1.
Example:

Query	Output
SELECT sequence(2, 7)	[2, 3, 4, 5, 6, 7]
SELECT sequence(7, 1)	[7, 6, 5, 4, 3, 2, 1]

sequence(start, stop, step) → array(bigint)

Generate a sequence of integers from start to stop, incrementing by step.

sequence(start, stop) → array(date)

Generate a sequence of dates from start date to stop date, incrementing by 1 day if start date is less than or equal to stop date, otherwise -1 day.

sequence(start, stop, step) → array(date)

Generate a sequence of dates from start to stop, incrementing by step. The type of step can be either INTERVAL DAY TO SECOND or INTERVAL YEAR TO MONTH.

shuffle(x) → array

Generate a random permutation of the given array x.
Example:

Query	Output
SELECT shuffle(ARRAY[1, 7, 2])	[7, 1, 2]

slice(x, start, length) → array

Subsets array x starting from index start (or starting from the end if start is negative) with a length of length.
Example:

Query	Output
SELECT slice(ARRAY[1, 7, 2, 87, 12, 09], 2, 4)	[7, 2, 87, 12]

transform(array(T), function(T, U)) → array(U)

Returns an array that is the result of applying function to each element of array
Example:

Query	Output
SELECT transform(ARRAY [5, NULL, 6], x -> COALESCE(x, 0) + 1)	[6, 1, 7]
SELECT transform(ARRAY [‘x’, ‘abc’, ‘z’], x -> x \|\| ‘0’)	[‘x0’, ‘abc0’, ‘z0’]

zip(array1, array2[, …]) → array(row)

Merges the given arrays, element-wise, into a single array of rows. The M-th element of the N-th argument will be the N-th field of the M-th output element. If the arguments have an uneven length, missing values are filled with NULL.
Example:

Query	Output
SELECT zip(ARRAY[1, 2], ARRAY[‘1b’, null, ‘3b’])	[ROW(1, ‘1b’), ROW(2, null), ROW(null, ‘3b’)]

zip_with(array(T), array(U), function(T, U, R)) → array(R)

Merges the two given arrays, element-wise, into a single array using function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying function:
Example:

Query	Output
SELECT zip_with(ARRAY[1, 3, 5], ARRAY[‘a’, ‘b’, ‘c’], (x, y) -> (y, x))	[ROW(‘a’, 1), ROW(‘b’, 3), ROW(‘c’, 5)]

Window functions

Window functions perform calculations across rows of the query result. They run after the HAVING clause but before the ORDER BY clause. Invoking a window function requires special syntax using the OVER clause to specify the window. A window has three components:

The partition specification, which separates the input rows into different partitions. This is analogous to how the GROUP BY clause separates rows into different groups for aggregate functions.
The ordering specification, which determines the order in which input rows will be processed by the window function.
The window frame, which specifies a sliding window of rows to be processed by the function for a given row. If the frame is not specified, it defaults to RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This frame contains all rows from the start of the partition up to the last peer of the current row.

rank() → bigint

Returns the rank of a value in a group of values. The rank is one plus the number of rows preceding the row that are not peer with the row. Thus, tie values in the ordering will produce gaps in the sequence. The ranking is performed for each window partition.
As shown below, the rank function produces a numerical rank within the current row’s partition for each distinct ORDER BY value, in the order defined by the ORDER BY clause. rank needs no explicit parameter, because its behavior is entirely determined by the OVER clause.
Query:

WITH dataset AS 
    (SELECT 'name' AS rows, ARRAY['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd', 'e'] AS sample )
SELECT rows,
         value,
         rank()
    OVER (ORDER BY value) AS rank
FROM dataset
CROSS JOIN UNNEST(sample) AS t(value)

Output:

	rows	value	rank
1	name	a	1
2	name	a	1
3	name	a	1
4	name	b	4
5	name	b	4
6	name	c	6
7	name	c	6
8	name	d	8
9	name	e	9

row_number() → bigint

returns a unique, sequential number for each row, starting with one, according to the ordering of rows within the window partition.
Query:

WITH dataset AS 
    (SELECT 'name' AS rows, ARRAY['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd', 'e'] AS sample )
SELECT rows,
         value,
         row_number()
    OVER (ORDER BY value) AS row_num
FROM dataset
CROSS JOIN UNNEST(sample) AS t(value)

Output:

	rows	value	row_num
1	name	a	1
2	name	a	2
3	name	a	3
4	name	b	4
5	name	b	5
6	name	c	6
7	name	c	7
8	name	d	8
9	name	e	9

dense_rank() → bigint

returns the rank of a value in a group of values. This is similar to rank(), except that tie values do not produce gaps in the sequence.
Query:

WITH dataset AS 
    (SELECT 'name' AS rows, ARRAY['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd', 'e'] AS sample )
SELECT rows,
         value,
         dense_rank()
    OVER (ORDER BY value) AS dense_ran
FROM dataset
CROSS JOIN UNNEST(sample) AS t(value)

Output:

	rows	value	dense_ran
1	name	a	1
2	name	a	1
3	name	a	1
4	name	b	2
5	name	b	2
6	name	c	3
7	name	c	3
8	name	d	4
9	name	e	5

Map Functions and Operators

map(array(K), array(V)) → map(K, V)

Returns a map created using the given key/value arrays.
Example:

Query	Output
SELECT map(ARRAY[1,3], ARRAY[2,4]) AS a	{1=2, 3=4}
SELECT map(ARRAY[‘k1′,’k2’], ARRAY[‘value1’, ‘value2’])	{k1=value1, k2=value2}

Subscript Operator: [ ]

The [ ] operator is used to retrieve the value corresponding to a given key from a map.
Example:

Query	Output
SELECT map(ARRAY[1,3], ARRAY[2,4]) AS a	{1=2, 3=4}
SELECT a[1] FROM (SELECT map(ARRAY[1,3], ARRAY[2,4]) AS a)	2

element_at(map(K, V), key) → V

Returns value for given key, or NULL if the key is not contained in the map.
Example:

Query

SELECT element_at(a, ‘k1’) FROM (SELECT map(ARRAY[‘k1′,’k2’], ARRAY[‘value1’, ‘value2’]) AS a)

cardinality(x) → bigint

Returns the cardinality (size) of the map x.

map() → map<unknown, unknown>

Returns an empty map.
Example:

Query	Output
SELECT map()	{}

map_from_entries(array(row(K, V))) → map(K, V)

Returns a map created from the given array of entries.
Example:

Query	Output
SELECT map_from_entries(ARRAY[(1, ‘x’), (2, ‘y’)])	{1 -> ‘x’, 2 -> ‘y’}

map_agg(key, value) → map(K, V)

Returns a map created from the input key/value pairs.

multimap_from_entries(array(row(K, V))) → map(K, array(V))

Returns a multimap created from the given array of entries. Each key can be associated with multiple values.
Example:

Query	Output
SELECT multimap_from_entries(ARRAY[(1, ‘x’), (2, ‘y’), (1, ‘z’)])	{1 -> [‘x’, ‘z’], 2 -> [‘y’]}

map_concat(map1(K, V), map2(K, V), …, mapN(K, V)) → map(K, V)

Returns the union of all the given maps. If a key is found in multiple given maps, that key’s value in the resulting map comes from the last one of those maps.

map_filter(map(K, V), function(K, V, boolean)) → map(K, V)

Constructs a map from those entries of map for which function returns true.
Example:

Query	Output
SELECT map_filter(MAP(ARRAY[], ARRAY[]), (k, v) -> true)	{}
SELECT map_filter(MAP(ARRAY[10, 20, 30], ARRAY[‘a’, NULL, ‘c’]), (k, v) -> v IS NOT NULL)	{10 -> a, 30 -> c}

map_keys(x(K, V)) → array(K)

Returns all the keys in the map x.
Example:

Query	Output
SELECT map_keys(map(ARRAY[‘k1′,’k2’], ARRAY[‘value1’, ‘value2’]))	[k1, k2]

map_values(x(K, V)) → array(V)

Returns all the values in the map x.
Example:

Query	Output
SELECT map_values(map(ARRAY[‘k1′,’k2’], ARRAY[‘value1’, ‘value2’]))	[value1, value2]

transform_keys(map(K1, V), function(K1, V, K2)) → map(K2, V)

Returns a map that applies function to each entry of map and transforms the keys.
Example:

Query	Output
SELECT transform_keys(MAP(ARRAY[], ARRAY[]), (k, v) -> k + 1)	{}
SELECT transform_keys(MAP(ARRAY [1, 2, 3], ARRAY [‘a’, ‘b’, ‘c’]), (k, v) -> k + 1)	{2 -> a, 3 -> b, 4 -> c}
SELECT transform_keys(MAP(ARRAY [‘a’, ‘b’], ARRAY [1, 2]), (k, v) -> k \|\| CAST(v as VARCHAR))	{a1 -> 1, b2 -> 2}

transform_values(map(K, V1), function(K, V1, V2)) → map(K, V2)

Returns a map that applies function to each entry of map and transforms the values.
Example:

Query	Output
SELECT transform_values(MAP(ARRAY[], ARRAY[]), (k, v) -> v + 1)	{}
SELECT transform_values(MAP(ARRAY [1, 2, 3], ARRAY [10, 20, 30]), (k, v) -> v + k)	{1 -> 11, 2 -> 22, 3 -> 33}

Thanks for the read. Please feel free to reach out with your comments.

This story is authored by PV Subbareddy. Subbareddy is a Big Data Engineer specializing on Cloud Big Data Services and Apache Spark Ecosystem.

Advanced Analytics – Presto Functions and Operators Quick Review