Seeking Alpha Scraper
SeekingAlpha doesn’t have an official API, so I’m using an unofficial one from RapidAPI (the free account has a rate limit of 500 requests per day, as of 07.04.2025).
- First, get a list of all posts/analyses belonging to the given ticker. This returns a list of post IDs.
- Then, for every post, get the details:
```python
post_info = {
    "source": "seekingalpha",
    "title": title,
    "author": author_name,
    "time_of_post": time_of_post_formatted,
    "url": absolute_url,
    "content": filtered_content,
    "comments": comments,
}
```
To get the comments, two additional requests have to be made:
- The first request fetches the list of all comment IDs.
- The second request fetches the content of every comment by passing that list of IDs.
I initially requested the content of each comment separately, since the API documentation wasn’t clear on this. That was incredibly inefficient, so I tinkered with the API and found that `comment_ids` accepts a list, which means the contents of all comments can be fetched in a single request.
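The batch trick boils down to joining all IDs into one parameter value. A minimal sketch, again with an assumed endpoint path and parameter name:

```python
import urllib.parse
import urllib.request

API_HOST = "seeking-alpha.p.rapidapi.com"  # assumed RapidAPI host

def comment_ids_param(comment_ids: list) -> str:
    # All ids joined into one comma-separated value, so a single
    # request returns the contents of every comment on a post
    return ",".join(str(cid) for cid in comment_ids)

def build_comment_contents_request(comment_ids: list, api_key: str) -> urllib.request.Request:
    # Endpoint path and parameter name are assumptions about the unofficial API
    query = urllib.parse.urlencode({"comment_ids": comment_ids_param(comment_ids)})
    return urllib.request.Request(
        f"https://{API_HOST}/comments/get-contents?{query}",
        headers={"x-rapidapi-key": api_key, "x-rapidapi-host": API_HOST},
    )
```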
This leaves us with the following formula for the number of requests needed:

1 + num_posts * (1 + 1)

- the leading 1 is the request for the post list
- the first 1 inside the parentheses is the per-post request for the comment-ID list
- the second 1 is the per-post request for the contents of all of that post’s comments
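As code, the request budget works out to:

```python
def total_requests(num_posts: int) -> int:
    # 1 request for the post list, then per post:
    # 1 for the comment-id list + 1 for all comment contents
    return 1 + num_posts * (1 + 1)
```

With the free tier’s 500 requests per day, this covers roughly 249 posts (1 + 249 * 2 = 499).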
Lastly, the post content and the comment contents are cleaned up by removing script and style elements, stripping URLs and extra whitespace, and unescaping HTML entities.
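That cleanup step can be sketched with stdlib-only regex handling (a simplification; a real HTML parser such as BeautifulSoup would be more robust):

```python
import html
import re

def clean_content(raw: str) -> str:
    # Drop <script> and <style> blocks along with their contents
    text = re.sub(r"<(script|style)\b[^>]*>.*?</\1>", "", raw,
                  flags=re.DOTALL | re.IGNORECASE)
    # Strip any remaining HTML tags
    text = re.sub(r"<[^>]+>", " ", text)
    # Unescape HTML entities (&amp; -> &, etc.)
    text = html.unescape(text)
    # Remove URLs
    text = re.sub(r"https?://\S+", "", text)
    # Collapse runs of whitespace into single spaces
    return re.sub(r"\s+", " ", text).strip()
```

Note the ordering: tags are stripped before entities are unescaped, so unescaping can’t reintroduce angle brackets that would then be treated as tags.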