Explore stock market data

When you’ve successfully collected 1-min intraday stock data, it’s time to have some fun and explore the data.

Because of the high granularity of the dataset, there are numerous ways to explore it. For example, you could analyze stock prices and volumes on a minute-by-minute basis. With TimescaleDB, you could also bucket records into custom intervals (for example, 2 min or 15 min) using TimescaleDB aggregate functions.

Let’s see how it’s done!

Install Plotly and Pandas

To get started with data exploration, you need to install a couple of tools first:

  • Pandas, to query and structure the data (this is already installed if you have completed the steps in the previous sections)
  • Plotly, to create visualizations quickly

Install both

  1. pip install plotly pandas

When you have those installed, you need to open a new Python file, or use a Jupyter notebook to start exploring the dataset.

Establish database connection

Use the configuration file you created earlier with psycopg2 to create a database connection object:

  1. import config, psycopg2
  2. conn = psycopg2.connect(database=config.DB_NAME,
  3. host=config.DB_HOST,
  4. user=config.DB_USER,
  5. password=config.DB_PASS,
  6. port=config.DB_PORT)

In each data exploration script, you need to reference this connection object to be able to query the database.

Generate stock market insights

Let’s start off analyzing trading volumes, then have a look at weekly price points, and finally, dig deep on price changes. The results of the queries shown are visualized using Plotly.

tip

Let these queries serve as inspiration to you, and feel free to change things up, like the analyzed bucket, the symbol or other parts of the query. Have fun!

  1. Which symbols have the highest transaction volumes?
  2. How did Apple’s trading volume change over time?
  3. How did Apple’s stock price change over time?
  4. Which symbols had the highest weekly gains?
  5. Weekly FAANG prices over time?
  6. Weekly price changes of Apple, Facebook, Google?
  7. Distribution of daily price changes of Amazon and Zoom
  8. Apple 15-min candlestick chart

1. Which symbols have the highest transaction volumes?

Let’s generate a bar chart that shows the most traded symbols in the last 14 days:

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT symbol, sum(trading_volume) AS volume
  5. FROM stocks_intraday
  6. WHERE time > now() - INTERVAL '{bucket}'
  7. GROUP BY symbol
  8. ORDER BY volume DESC
  9. LIMIT 5
  10. """.format(bucket="14 day")
  11. df = pd.read_sql(query, conn)
  12. fig = px.bar(df, x='symbol', y='volume', title="Most traded symbols in the last 14 days")
  13. fig.show()

most traded symbols

2. How did Apple’s trading volume change over time?

Now let’s try a similar query focused on the daily trading volume of one symbol (for example, ‘AAPL’).

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT time_bucket('{bucket}', time) AS bucket, sum(trading_volume) AS volume
  5. FROM stocks_intraday
  6. WHERE symbol = '{symbol}'
  7. GROUP BY bucket
  8. ORDER BY bucket
  9. """.format(bucket="1 day", symbol="AAPL")
  10. df = pd.read_sql(query, conn)
  11. fig = px.line(df, x='bucket', y='volume', title="Apple's daily trading volume over time")
  12. fig.show()

apple trading volume over time

3. How did Apple’s stock price change over time?

This query returns the weekly stock price of Apple over time:

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT time_bucket('{bucket}', time) AS bucket,
  5. last(price_close, time) AS last_closing_price
  6. FROM stocks_intraday
  7. WHERE symbol = '{symbol}'
  8. GROUP BY bucket
  9. ORDER BY bucket
  10. """.format(bucket="7 days", symbol="AAPL")
  11. df = pd.read_sql(query, conn)
  12. fig = px.line(df, x='bucket', y='last_closing_price')
  13. fig.show()

apple price over time

4. Which symbols had the highest weekly gains?

Now generate a table containing the symbols with the biggest weekly gains:

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT symbol, bucket, max((closing_price-opening_price)/closing_price*100) AS price_change_pct
  5. FROM (
  6. SELECT
  7. symbol,
  8. time_bucket('{bucket}', time) AS bucket,
  9. first(price_open, time) AS opening_price,
  10. last(price_close, time) AS closing_price
  11. FROM stocks_intraday
  12. GROUP BY bucket, symbol
  13. ) s
  14. GROUP BY symbol, s.bucket
  15. ORDER BY price_change_pct {orderby}
  16. LIMIT 5
  17. """.format(bucket="7 days", orderby="DESC")
  18. df = pd.read_sql(query, conn)
  19. print(df)
symbolbucketprice_change_pct
ZM2021-06-0724.586495
TSLA2021-01-0418.280314
BA2021-03-0817.745225
SNAP2021-02-0116.149649
TSLA2021-03-0815.842941

price_change_pct shows the price change that happened between the start and end of the week.

bucket shows (the first day of) the week.

tip

Change orderby to “ASC” to query the biggest losses.

5. Weekly FAANG prices over time?

Let’s see a line chart with the FAANG (Facebook, Apple, Amazon, Netflix, Google/Alphabet) weekly stock prices:

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT symbol, time_bucket('{bucket}', time) AS bucket,
  5. last(price_close, time) AS last_closing_price
  6. FROM stocks_intraday
  7. WHERE symbol in {symbols}
  8. GROUP BY bucket, symbol
  9. ORDER BY bucket
  10. """.format(bucket="7 days", symbols="('FB', 'AAPL', 'AMZN', 'NFLX', 'GOOG')")
  11. df = pd.read_sql(query, conn)
  12. fig = px.line(df, x='bucket', y='last_closing_price', color='symbol', title="FAANG prices over time")
  13. fig.show()

faang prices

6. Weekly price changes of Apple, Facebook, Google?

Analyzing the price points directly can be useful when you are looking at one specific symbol, but if you want to compare different stocks, it might be better to look at price changes instead. Let’s compare the price changes of Apple, Facebook, and Google:

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT symbol, bucket, max((closing_price-opening_price)/closing_price) AS price_change_pct
  5. FROM (
  6. SELECT
  7. symbol,
  8. time_bucket('{bucket}}', time) AS bucket,
  9. first(price_open, time) AS opening_price,
  10. last(price_close, time) AS closing_price
  11. FROM stocks_intraday
  12. WHERE symbol IN {symbols}
  13. GROUP BY bucket, symbol
  14. ) s
  15. GROUP BY symbol, s.bucket
  16. ORDER BY bucket
  17. """.format(bucket="7 days", symbols="('AAPL', 'FB', 'GOOG')")
  18. df = pd.read_sql(query, conn)
  19. figure = px.line(df, x="bucket", y="price_change_pct", color="symbol", title="Apple, Facebook, Google weekly price changes")
  20. figure = figure.update_layout(yaxis={'tickformat': '.2%'})
  21. figure.show()

weekly price changes

7. Distribution of daily price changes of Amazon and Zoom

Now let’s generate a scatter chart to look at the distribution of daily price changes of Amazon and Zoom. Analyzing this data enables you to better understand the volatility of individual stocks and how they compare to each other.

  1. import plotly.express as px
  2. import pandas as pd
  3. query = """
  4. SELECT symbol, bucket, max((closing_price-opening_price)/closing_price) AS price_change_pct
  5. FROM (
  6. SELECT
  7. symbol,
  8. time_bucket('{bucket}', time) AS bucket,
  9. first(price_open, time) AS opening_price,
  10. last(price_close, time) AS closing_price
  11. FROM stocks_intraday
  12. WHERE symbol IN {symbols}
  13. GROUP BY bucket, symbol
  14. ) s
  15. GROUP BY symbol, s.bucket
  16. ORDER BY bucket
  17. """.format(bucket="1 day", symbols="('ZM', 'AMZN')")
  18. df = pd.read_sql(query, conn)
  19. figure = px.scatter(df, x="price_change_pct", color="symbol", title="Distribution of daily price changes (Amazon, Zoom)")
  20. figure = figure.update_layout(xaxis={'tickformat': '.2%'})
  21. figure.show()

distribution of price changes

8. Apple 15-min candlestick chart

Finally, because this is a tutorial about stocks, let’s generate a 15-min candlestick chart for Apple:

For candlestick charts, you need to import Plotly’s graph_object module.

  1. import pandas as pd
  2. import plotly.graph_objects as go
  3. query = """
  4. SELECT time_bucket('{bucket}', time) AS bucket,
  5. FIRST(price_open, time) AS price_open,
  6. LAST(price_close, time) AS price_close,
  7. MAX(price_high) AS price_high,
  8. MIN(price_low) AS price_low
  9. FROM stocks_intraday
  10. WHERE symbol = '{symbol}' AND date(time) = date('{date}')
  11. GROUP BY bucket
  12. """.format(bucket="15 min", symbol="AAPL", date="2021-06-09")
  13. df = pd.read_sql(query, conn)
  14. figure = go.Figure(data=[go.Candlestick(x=df['bucket'],
  15. open=df['price_open'],
  16. high=df['price_high'],
  17. low=df['price_low'],
  18. close=df['price_close'],)])
  19. figure.update_layout(title="15-min candlestick chart of Apple, 2021-06-09")
  20. figure.show()
tip

Change date to see the candlesticks for another day.

candlestick chart apple

Resources