The AI explorer uses GPT-4 to translate questions about your product into SQL queries that can be executed on our database. While the AI explorer is in alpha phase, it's possible that it will sometimes generate SQL queries that don't really work with our database schema.
When that happens, you can refer this guide to get a sense of our database schema and the types of queries you can run on it.
We use ClickHouse as our analytics database, and give you read access to 3 tables: events, all_user_traits, and group_traits. This means that you can run SELECT
queries on these tables and use ClickHouse functions depending on the data types of the columns you're working with. In most cases, you will use DateTime, String, and Integer functions.
Schema
events
This table contains all the TRACK and PAGE events that you've sent to us.
type | Integer |
name | String |
timestamp | DateTime |
group_id | String |
user_id | String |
properties | String |
context | String |
all_user_traits
This table contains all the users you've sent to us, along with all of their traits and values.
user_id | String |
key | String |
value | String |
timestamp | DateTime |
group_traits
This table contains all the companies you've sent to us, along with all of their traits and values, and all the users associated with them.
group_id | String |
user_id | String |
key | String |
value | String |
timestamp | DateTime |
Example queries
Count all track events
SELECT COUNT() FROM events WHERE type = 2
Count all page events
SELECT COUNT() FROM events WHERE type = 0
Count all distinct users
SELECT COUNT(DISTINCT user_id)
FROM all_user_traits
WHERE notEmpty(user_id)
Get all users who belong to a company
SELECT DISTINCT user_id
FROM group_traits
WHERE group_id = 'A'
Get all users where trait staff is true
SELECT
DISTINCT user_id
FROM (
SELECT
user_id,
key,
argMax(value, timestamp) AS _value
FROM all_user_traits
GROUP BY user_id, key
)
WHERE key = 'staff' AND _value = 'true'
Get all companies where trait is_paying is true
SELECT
DISTINCT group_id
FROM (
SELECT
group_id,
key,
argMax(value, timestamp) AS _value
FROM group_traits
GROUP BY group_id, key
)
WHERE key = 'is_paying' AND _value = 'true'
Count the events performed by non-staff users between a date range
SELECT
COUNT()
FROM events
WHERE timestamp BETWEEN '2023-01-01' AND '2023-02-01'
AND user_id IN (
SELECT
DISTINCT user_id
FROM (
SELECT
user_id,
key,
argMax(value, timestamp) AS _value
FROM all_user_traits
GROUP BY user_id, key
)
WHERE key = 'staff' AND _value != 'true'
)
Event property breakdown
SELECT properties_values[indexOf(properties_keys, '<your property name>')] AS <your property name>, COUNT(*) as count
FROM events
WHERE name = '<your event name>' AND has(properties_keys, '<your property name>')
GROUP BY <your property name>
ORDER BY count DESC
π΄ We do not share any user data with Open.ai. We just share the names of your events and the names of your traits π΄