Database Structure
Entity-Relationship Overview
-
Tickers
Stores stock symbols and related metadata. -
Posts
Contains posts or articles related to a ticker. This now also includes an optionalauthor
field. -
Points
Represents extracted investment thesis points from posts. Each point includes a sentiment score, the extracted text, and an embedding vector for semantic similarity, along with a flag indicating if any criticism exists. -
Criticisms
Contains valid criticisms linked to a specific point, optionally referencing the comment that inspired the criticism. -
Comments
Stores individual comments or feedback related to posts, which may also be referenced by criticisms.
Tables and Fields
1. Tickers
Stores information about each stock ticker.
-
Fields:
-
id
: Primary Key, Integer. -
symbol
: String (up to 10 characters), Unique, Not Null. -
name
: String (up to 100 characters). -
description
: Text – an AI-generated description based on financial data. -
overall_sentiment_score
: Integer, constrained with a check (“overall_sentiment_score BETWEEN 1 and 100”), Nullable. -
last_analyzed
: DateTime – timestamp of the last analysis. -
description_last_analyzed
: DateTime – timestamp of the last description update.
-
-
Relationships:
-
Posts: One-to-many relationship with Posts (cascade delete enabled).
-
Points: One-to-many relationship with Points (cascade delete enabled).
-
-
Additional Note:
- A custom
__repr__
method is defined for debugging and logging purposes.
- A custom
2. Posts
Contains posts or articles from various sources related to a ticker.
-
Fields:
-
id
: Primary Key, Integer. -
ticker_id
: Foreign Key referencingtickers.id
, Not Null. -
source
: String (up to 50 characters), Not Null – e.g., Reddit, Substack. -
title
: String (up to 250 characters), Nullable. -
author
: String (up to 100 characters), Nullable – represents the author of the post. -
link
: Text – URL to the original post. -
date_of_post
: DateTime – publication timestamp. -
content
: Text – full content of the post.
-
-
Relationships:
-
Ticker: Belongs to a single Ticker.
-
Points: One-to-many relationship with Points (cascade delete enabled).
-
Comments: One-to-many relationship with Comments (cascade delete enabled).
-
-
Additional Note:
- A custom
__repr__
method provides a concise string representation of each post.
- A custom
3. Points
Extracted investment thesis points from posts.
-
Fields:
-
id
: Primary Key, Integer. -
ticker_id
: Foreign Key referencingtickers.id
, Not Null. -
post_id
: Foreign Key referencingposts.id
, Not Null. -
sentiment_score
: Integer, with a check constraint (“sentiment_score BETWEEN 1 and 100”), Not Null. -
text
: Text, Not Null – the extracted point or thesis. -
criticism_exists
: Boolean, default set to False – indicates whether any valid criticism is linked. -
embedding
: ARRAY(Float) – stores a semantic embedding vector for similarity comparisons.
-
-
Relationships:
-
Ticker: Belongs to a Ticker.
-
Post: Linked to the originating Post.
-
Criticisms: One-to-many relationship with Criticisms (cascade delete enabled).
-
-
Additional Note:
- The custom
__repr__
method returns a brief representation of the point including its sentiment score and a preview of the text.
- The custom
4. Criticisms
Stores valid criticisms associated with specific points.
-
Fields:
-
id
: Primary Key, Integer. -
point_id
: Foreign Key referencingpoints.id
, Not Null. -
comment_id
: Foreign Key referencingcomments.id
, Nullable – if the criticism is linked to a particular comment. -
text
: Text, Not Null – the content of the criticism. -
date_posted
: DateTime – timestamp indicating when the criticism was recorded. -
validity_score
: Integer, with a check constraint (“validity_score BETWEEN 1 and 100”), Nullable – quantifies the strength or validity of the criticism.
-
-
Relationships:
-
Point: Belongs to a specific Point.
-
Comment: Optionally linked to a Comment from which the criticism originated.
-
-
Additional Note:
- A custom
__repr__
method provides a concise summary of the criticism.
- A custom
5. Comments
Contains individual comments or feedback linked to posts.
-
Fields:
-
id
: Primary Key, Integer. -
post_id
: Foreign Key referencingposts.id
, Not Null. -
content
: Text, Not Null – the comment’s text. -
link
: Text – optional URL or reference. -
author
: String (up to 100 characters), Nullable – the name of the comment’s author.
-
-
Relationships:
-
Post: Belongs to a specific Post.
-
Criticisms: One-to-many relationship with Criticisms (cascade delete enabled), allowing a comment to be referenced by one or more criticisms.
-
-
Additional Note:
- A custom
__repr__
method is defined for ease of debugging and logging.
- A custom
Sample Data Entries
Tickers
id | symbol | name | description | overall_sentiment_score | last_analyzed | description_last_analyzed |
---|---|---|---|---|---|---|
1 | AAPL | Apple Inc. | … | 85 | 2023-10-15 14:35 | 2023-10-10 14:40 |
2 | TSLA | Tesla, Inc. | … | 70 | 2023-10-15 14:35 | 2023-10-10 12:30 |
Posts
id | ticker_id | source | title | author | link | date_of_post | content |
---|---|---|---|---|---|---|---|
1 | 1 | Apple’s Q4 Earnings | AnalystA | http://reddit.com/post1 | 2023-10-15 14:35 | We should all invest in Apple… | |
2 | 2 | Substack | Tesla’s New Model | AnalystB | http://substack.com/p2 | 2023-10-15 14:35 | I think TSLA is overvalued… |
Points
id | ticker_id | post_id | sentiment_score | text | criticism_exists | embedding |
---|---|---|---|---|---|---|
1 | 1 | 1 | 85 | ”Strong sales in Q4” | False | [0.134, 0.298, -0.076, 0.415, 0.092, …] |
2 | 2 | 2 | 32 | ”Concerns over production” | True | [-0.213, 0.412, 0.103, -0.349, 0.287, …] |
Criticisms
id | point_id | comment_id | text | date_posted | validity_score |
---|---|---|---|---|---|
1 | 2 | 5 | ”Production issues are temporary” | 2023-10-14 12:00 | 80 |
Comments
id | post_id | content | link | author |
---|---|---|---|---|
5 | 2 | ”Industry reports suggest that production issues are only short-term.” | http://example.com/comment5 | AnalystX |