From JSON to Dashboard: Visualizing DuckDB Queries in Streamlit with Plotly

Image by Editor | ChatGPT

Introduction

Data is the most important asset a company has, and the insights drawn from it can make the difference between profit and failure. However, raw data is hard to understand, so we visualize it in dashboards that non-technical people can navigate more easily.

Building a dashboard is not straightforward, especially when working with JSON data. Fortunately, many Python libraries can be combined to create a useful tool.

In this article, we will learn how to build a dashboard using Streamlit and Plotly to visualize DuckDB queries on data from a JSON file.

Curious? Let's get into it.

Dashboard Development

Before developing our dashboard, let's learn a little about the tools we will use.

First, JSON, or JavaScript Object Notation, is a text-based format for storing and transferring data using key-value pairs and arrays. It is a very common format for APIs and for data interchange between programs.

Next, DuckDB is an open-source RDBMS (Relational Database Management System) designed for analytical workloads. It is an in-process online analytical processing (OLAP) database that runs directly inside the Python process, with no separate server to manage. It is also optimized for fast execution, which makes it well suited to analyzing large datasets.

Streamlit is often used for dashboard development. It is an open-source framework for building interactive web applications using Python. To develop the dashboard, we do not need to understand HTML, CSS, or JavaScript.

We will also use pandas, a powerful library for data manipulation and analysis in Python.

Lastly, Plotly is an open-source library for building interactive graphs and charts. It can be integrated with dashboard development libraries such as Streamlit.

That is the basic explanation of the tools we will use. Let's start developing our JSON dashboard. We will use the following structure, so try to create it as shown.

JSON_Dashboard/
├── data/
│   └── sample.json
├── app.py
└── requirements.txt

Next, let's fill the files with the required content. First, let's put some example JSON data in data/sample.json, like the one below. You can always use your own data, but here is an example you can use.

[
  {"id": 1, "category": "Electronics", "region": "North", "sales": 100, "profit": 23.5, "date": "2024-01-15"},
  {"id": 2, "category": "Furniture", "region": "South", "sales": 150, "profit": 45.0, "date": "2024-01-18"},
  {"id": 3, "category": "Electronics", "region": "East", "sales": 70, "profit": 12.3, "date": "2024-01-20"},
  {"id": 4, "category": "Clothing", "region": "West", "sales": 220, "profit": 67.8, "date": "2024-01-25"},
  {"id": 5, "category": "Furniture", "region": "North", "sales": 130, "profit": 38.0, "date": "2024-02-01"},
  {"id": 6, "category": "Clothing", "region": "South", "sales": 180, "profit": 55.2, "date": "2024-02-05"},
  {"id": 7, "category": "Electronics", "region": "West", "sales": 90, "profit": 19.8, "date": "2024-02-10"},
  {"id": 8, "category": "Furniture", "region": "East", "sales": 160, "profit": 47.1, "date": "2024-02-12"},
  {"id": 9, "category": "Clothing", "region": "North", "sales": 200, "profit": 62.5, "date": "2024-02-15"},
  {"id": 10, "category": "Electronics", "region": "South", "sales": 110, "profit": 30.0, "date": "2024-02-20"}
]
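If you swap in your own data, it may help to confirm that every record carries the fields the dashboard code relies on later. The following is a minimal sketch using only the standard library. The helper name find_invalid_records and the REQUIRED_FIELDS set are assumptions made for illustration and are not part of the original app.

```python
import json
from datetime import date

# Fields the dashboard code reads later (assumed from the sample data above)
REQUIRED_FIELDS = {"id", "category", "region", "sales", "profit", "date"}


def find_invalid_records(records):
    """Return (index, reason) pairs for records the dashboard could not use."""
    problems = []
    for index, record in enumerate(records):
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            problems.append((index, f"missing fields: {sorted(missing)}"))
            continue
        try:
            # Dates in the sample use the ISO format YYYY-MM-DD
            date.fromisoformat(record["date"])
        except (TypeError, ValueError):
            problems.append((index, f"invalid date: {record['date']!r}"))
    return problems


# Two inline records: the first matches the sample, the second is deliberately broken
sample = json.loads("""
[
  {"id": 1, "category": "Electronics", "region": "North", "sales": 100, "profit": 23.5, "date": "2024-01-15"},
  {"id": 2, "category": "Furniture", "region": "South", "sales": 150, "date": "2024-01-18"}
]
""")

problems = find_invalid_records(sample)
print(problems)
```

An empty list means every record has the expected fields and a parseable date.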

Next, we will fill the requirements.txt file with the libraries we will use for our dashboard development.

streamlit
duckdb
pandas
plotly

Then, run the following command to install the required libraries. It is recommended to use a virtual environment when setting things up.

pip install -r requirements.txt

Once everything is ready, we will develop our dashboard. We will walk through the application code step by step so you can follow the logic.

Let's start by importing the libraries needed for our dashboard.

import streamlit as st
import duckdb
import pandas as pd
import plotly.express as px

Next, we will set up the connection we need to DuckDB.

@st.cache_resource
def get_conn():
    return duckdb.connect()

The code above caches the DuckDB connection, so Streamlit does not need to reconnect every time the dashboard reruns.

After that, we prepare the code to read the JSON data using the following code.

@st.cache_data
def load_data(path):
    df = pd.read_json(path, convert_dates=["date"])
    return df

In the above code, we convert the JSON file into a pandas DataFrame and cache the data so we don't need to read it again when a filter changes.

With the data loading and the connection ready, we will use DuckDB to store the JSON data. You can always change the data location and the table name.

conn = get_conn()
df_full = load_data("data/sample.json")
conn.execute("CREATE OR REPLACE TABLE sales AS SELECT * FROM df_full")

In the above code, we register the DataFrame as a SQL table named sales inside DuckDB. The table is recreated in memory on every rerun, as we are not setting up persistence in a separate script.

That's all for the backend; let's prepare the Streamlit dashboard. First, let's set up the dashboard title and the sidebar filters.

st.title("From JSON to Dashboard: DuckDB SQL Visualizer")

st.sidebar.header("Filter Options")
category = st.sidebar.multiselect("Select Category:", df_full['category'].unique())
region = st.sidebar.multiselect("Select Region:", df_full['region'].unique())
date_range = st.sidebar.date_input("Select Date Range:", [df_full['date'].min(), df_full['date'].max()])

The sidebar above acts as a dynamic filter for the loaded data, and we can change the SQL query based on these filters.

We then build the SQL query according to the filters with the following code.

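query = "SELECT * FROM sales WHERE TRUE"
if category:
    # Build the IN list by hand; tuple() would render a single selection as ('x',)
    category_list = ", ".join(f"'{c}'" for c in category)
    query += f" AND category IN ({category_list})"
if region:
    region_list = ", ".join(f"'{r}'" for r in region)
    query += f" AND region IN ({region_list})"
# date_input returns a single date until both ends of the range are picked
if len(date_range) == 2:
    query += f" AND date BETWEEN '{date_range[0]}' AND '{date_range[1]}'"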

The above query is built dynamically based on the user's selection. We start with a WHERE TRUE condition to simplify appending additional filters with AND.

With the query generation ready, we will display the query and its results with the following code.
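Because the query string above interpolates the user's selections directly into the SQL text, one alternative is to pass the values as bound parameters instead. The sketch below is a hedged variant, not the article's code: the function name build_query is hypothetical, and it assumes the DuckDB Python client's support for ? placeholders when a parameter list is passed to conn.execute.

```python
from datetime import date


def build_query(category, region, date_range):
    """Build a SELECT over the sales table with ? placeholders and a matching parameter list."""
    query = "SELECT * FROM sales WHERE TRUE"
    params = []
    if category:
        placeholders = ", ".join("?" for _ in category)
        query += f" AND category IN ({placeholders})"
        params.extend(category)
    if region:
        placeholders = ", ".join("?" for _ in region)
        query += f" AND region IN ({placeholders})"
        params.extend(region)
    # Only filter by date once both ends of the range are known
    if len(date_range) == 2:
        query += " AND date BETWEEN ? AND ?"
        params.extend(date_range)
    return query, params


query, params = build_query(
    ["Electronics"], [], (date(2024, 1, 1), date(2024, 2, 29))
)
print(query)
print(params)
```

In the app, the pair could then be run with conn.execute(query, params).df(). The trade-off is that the displayed SQL would show placeholders rather than the literal filter values.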

st.subheader("Generated SQL Query")
st.code(query, language="sql")

df = conn.execute(query).df()
st.subheader(f"Query Results: {len(df)} rows")
st.dataframe(df)

The above code shows the SQL query used to retrieve data from DuckDB and converts the result into a pandas DataFrame to display the filtered table.

Finally, we will prepare the Plotly visualizations using the filtered data.

if not df.empty:
    col1, col2 = st.columns(2)

    with col1:
        st.markdown("### Scatter Plot: Sales vs Profit by Region")
        scatter_fig = px.scatter(df, x="sales", y="profit", color="region", hover_data=["category", "date"])
        st.plotly_chart(scatter_fig, use_container_width=True)

    with col2:
        st.markdown("### Bar Chart: Total Sales by Category")
        bar_fig = px.bar(df.groupby("category", as_index=False)["sales"].sum(), x="category", y="sales", text_auto=True)
        st.plotly_chart(bar_fig, use_container_width=True)

    st.markdown("### Line Chart: Daily Sales Trend")
    line_fig = px.line(df.groupby("date", as_index=False)["sales"].sum(), x="date", y="sales")
    st.plotly_chart(line_fig, use_container_width=True)
else:
    st.warning("No data found for the selected filters.")

In the code above, we create three different plots: a scatter plot, a bar chart, and a line chart. You can always switch the chart type according to your needs.
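To make the bar chart's input concrete, the sketch below reproduces the per-category sum that the groupby on category computes, using plain Python on the category and sales values from the sample JSON. It is for illustration only: the variable names are assumptions, and the app itself relies on pandas for this step.

```python
from collections import defaultdict

# (category, sales) pairs copied from the sample JSON records
rows = [
    ("Electronics", 100), ("Furniture", 150), ("Electronics", 70),
    ("Clothing", 220), ("Furniture", 130), ("Clothing", 180),
    ("Electronics", 90), ("Furniture", 160), ("Clothing", 200),
    ("Electronics", 110),
]

totals = defaultdict(int)
for category, sales in rows:
    totals[category] += sales

# Sort by category name, matching groupby's default sorted group order
for category, total in sorted(totals.items()):
    print(category, total)
# Clothing 600
# Electronics 370
# Furniture 440
```

With no filters applied, these are the three bar heights the dashboard would show.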

With all the code ready, we will use the following command to launch our Streamlit dashboard.

streamlit run app.py

Now you can access the dashboard, which looks like the image below.

Overview of the Streamlit dashboard display with filter options

The plots will look like the image below.

Scatter plot and bar chart visualizations in the Streamlit dashboard

As the visualizations use Plotly, you can explore them interactively, as shown in the line chart below.

An interactive line chart showing the daily sales trend in the Streamlit dashboard

That's all you need to know. You can always add more complexity to the dashboard and even deploy it in your business.

Conclusion

Data is the most valuable asset a company can have, and visualizing it in a dashboard is how business people gain insights from it. In this article, we learned how to develop a simple dashboard with Streamlit and Plotly that connects to data from a JSON file stored in DuckDB.

I hope this has helped!

Cornellius Yudha Wijaya is a data science manager and data writer. While working full-time at Allianz Indonesia, Cornellius likes to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
