Sync Data From S3

This is an end-to-end guide about how to move files from your AWS S3 to iomete and show it in the BI dashboard

Intro

This is an end-to-end guide about how to move files from your AWS S3 to iomete and show it in the BI dashboard. In this example, we use a JSON file, but for other file types (such as CSV, Parquet, and ORC) please check out: https://docs.iomete.com/external-data-sources/reading-raw-files

Your files in AWS S3

Let's say you have a dedicated bucket where you have files you want to move to iomete

Note: This bucket will be different in your case. This is just an example bucket for demonstration purpose

We want to query/migrate countries.jsonfile in iomete platform

The file (countries.json) we want to move to iomete

You could download  thecountries.json file for yourself with this command:

wget https://iomete-public.s3.eu-central-1.amazonaws.com/datasets/countries.json​

Create a storage integration in iomete

Choose AWS External Storage

2. Specify a name and enter your AWS S3 Location to create integration between to

Create AWS External Stage Storage Integration

3. Once it is created copy policies created to be added to your S3 Bucket permissions

4. Go to your AWS S3 Bucket and add generated JSON policy to your S3 Bucket's Permission

Create warehouse

Create a new warehouse instance and specify the storage integration you created in the previous step.


Moving Data

In the SQL Editor, you should be able to query the file and migrate to iomete using the following methods

Querying JSON file data without moving to iomete

Once you decided that you want to move data to iomete you could use the following commands

Option 1. Create a table from select

CREATE TABLE countries USING delta
   AS SELECT * FROM json.`s3a://my-staging-area-for-iomete/countries.json`​
Create a table directly from the query result

Option 2. Insert into to existing table

-- just append data
INSERT INTO countries
   SELECT  * FROM json.`s3a://my-staging-area-for-iomete/countries.json`

-- or you can use the follwing command to overwerite data
INSERT OVERWRITE TABLE countries
   SELECT  * FROM json.`s3a://my-staging-area-for-iomete/countries.json`
Insert to the existing table

Option 3. Merge with existing data

MERGE INTO countries
USING (SELECT  * FROM json.`s3a://my-staging-area-for-iomete/countries.json`) updates
ON countries.id = updates.id
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED
  THEN INSERT *

Visualize Data

First, let's create a view with clean column names to be used in BI dashboarding:

CREATE OR REPLACE VIEW countries_view 
  AS SELECT 
      country_code, 
      region, 
      `SP.POP.TOTL` AS sp_pop_totl 
     FROM countries;

Open BI Application

Add new database connection. Select Data -> Databases from the menu

Choose Database Type. Here you need to choose Apache Hive from the dropdown:

Replace iomete_username and warehouse_name with your values

hive://<iomete_username>:XXXXXXXXXX@<warehouse_name>-warehouse-thriftserver:10000/?auth=CUSTOM&transport_mode=http

Add new dataset

From the menu choose Data -> Dataset and click + Dataset button on the right top corner

Create a new chart

Click on the newly created dataset countries_view which opens chart view. Choose the visualization type and corresponding settings:

Save this chart to the dashboard too and navigate to the dashboard. And, here is the dashboard of the Countries that we just created

Congratulations !

Subscribe to iomete

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.
jamie@example.com
Subscribe