Error Handling

This is all pretty good so far, but there’s something I’ve been putting off for a bit too long: error handling. I do have some basic error handling here and there, as outlined before. What I still need is good error handling for the analysis specifically. The analysis puts a lot of moving parts together, so there’s a lot that can go wrong, and when it does, I (or the user) need to know what went wrong.

Luckily I already built a way to communicate the status of the analysis to the user by adding the /analysis-status endpoint and updating a dictionary with the current status of the analysis.

While looking at the /analysis-status endpoint, I realized I had implemented it in a pretty suboptimal way, so I’ll improve that first. Currently the endpoint blindly returns the value stored under the task_id key in TASKS. Since this information is displayed directly to the user, any error or problem could end up showing the user whatever was saved for that task_id, without any validation. This is bad for two reasons. First, I don’t want to confuse the user with strange output; I just want to tell them in a simple way that something went wrong. Second, it’s horrible from a security perspective, since it could lead to us revealing server-internal (possibly sensitive) output to the frontend.

Long story short, I’ll change the endpoint to return a fixed dictionary containing only the fields I want to display, retrieving each value explicitly instead of passing the whole task entry through.

This looks something like this:

return {
    "status": task.get("status"),
    "progress": task.get("progress"),
    "error": task.get("error"),
    "ticker": task.get("ticker"),
}

Note that I use get() so a missing value comes back as None instead of raising a KeyError and crashing the request.
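To make the idea concrete, here is a minimal, framework-independent sketch of the whitelisting approach. The PUBLIC_FIELDS tuple, the build_status_response helper, and the sample TASKS entry (including its internal_debug key) are made up for illustration and are not the actual app code.

```python
from typing import Any, Optional

# Hypothetical in-memory task store; the real TASKS layout may differ.
TASKS: dict[str, dict[str, Any]] = {
    "abc123": {
        "status": "fetching_posts",
        "progress": 40,
        "ticker": "AAPL",
        "internal_debug": "Traceback ... /srv/app/secrets.py",  # must never leak
    }
}

# Only these keys are ever sent to the frontend.
PUBLIC_FIELDS = ("status", "progress", "error", "ticker")


def build_status_response(task_id: str) -> Optional[dict[str, Any]]:
    """Return the whitelisted view of a task, or None if the task is unknown."""
    task = TASKS.get(task_id)
    if task is None:
        return None
    # .get() yields None for missing keys instead of raising KeyError.
    return {field: task.get(field) for field in PUBLIC_FIELDS}
```

Anything stored on the task that is not listed in PUBLIC_FIELDS simply never reaches the response, no matter what ended up in the task entry.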

This still doesn’t fix the issue 100%, though. As I mentioned, I don’t want to blindly return every error to the user, so I can’t just write every exception into the TASKS storage. Instead, I’ll wrap the run_analysis.py flow in a try-except block. If any exception occurs, the handler sets the error key to a standard message telling the user at what stage the analysis failed (based on the status key) and asking them to contact support. The actual error is then logged on the server with the logging library. For that I set up a logging config in main.py with two log files: errors.log and warnings.log. Errors are written to both errors.log and warnings.log, while warnings are only written to the latter. Errors are exceptions that break the main flow, while warnings are exceptions that get skipped. I also specify the format in which entries are logged.
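A rough sketch of how this could look, using only the standard logging module. The function names (configure_logging, run_analysis_safely), the logger name "analysis", and the log format are assumptions for illustration; the real main.py and run_analysis.py will differ.

```python
import logging
from typing import Any, Callable

# Assumed format; the actual format used in main.py is not shown here.
LOG_FORMAT = "%(asctime)s | %(levelname)s | %(name)s | %(message)s"


def configure_logging(errors_path: str = "errors.log",
                      warnings_path: str = "warnings.log") -> logging.Logger:
    """Attach two file handlers: ERROR and above to errors.log,
    WARNING and above (so errors too) to warnings.log."""
    logger = logging.getLogger("analysis")
    logger.setLevel(logging.WARNING)
    logger.handlers.clear()  # avoid duplicate handlers on re-configuration
    logger.propagate = False
    formatter = logging.Formatter(LOG_FORMAT)
    for path, level in ((errors_path, logging.ERROR),
                        (warnings_path, logging.WARNING)):
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setLevel(level)
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


def run_analysis_safely(task: dict[str, Any],
                        analysis: Callable[[dict[str, Any]], None],
                        logger: logging.Logger) -> None:
    """Run the analysis; on failure store a generic message and log the details."""
    try:
        analysis(task)
    except Exception:
        stage = task.get("status", "unknown")
        # The user only sees a fixed message plus the stage that failed.
        task["error"] = (
            f"The analysis failed during '{stage}'. Please contact support."
        )
        # The full traceback goes to the server logs only.
        logger.exception("Analysis failed at stage %r for ticker %r",
                         stage, task.get("ticker"))
```

Because the warnings.log handler accepts WARNING and above, every error automatically lands in both files, while a plain logger.warning(...) call only reaches warnings.log.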

Database error handling

Database error handling is a bit trickier, because I don’t want to leave sessions open when an error occurs, fail to save data or, even worse, save wrong data. What I’ll do here is add a centralized context manager which wraps the session and ensures a rollback if any error occurs. I’ll call this context manager session_scope and replace all uses of SessionLocal with it, using the contextlib library to build it. This also makes the code a lot cleaner, since I no longer have to commit or close manually every time; the context manager takes care of that.
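A minimal sketch of such a context manager is below. To keep the example self-contained, it takes the session factory as a parameter (an assumption on my part; the real session_scope may well reference SessionLocal directly) and only relies on the session having commit(), rollback(), and close().

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def session_scope(session_factory: Callable[[], Any]) -> Iterator[Any]:
    """Transactional scope: commit on success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        # Undo everything done in this scope, then let the caller see the error.
        session.rollback()
        raise
    finally:
        session.close()
```

Usage would then look like `with session_scope(SessionLocal) as session: session.add(obj)`, with no explicit commit or close in the calling code.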

There’s also the handling of database warnings. As I mentioned previously, for some exceptions I just want to log a warning and continue. One example is when the app is going through a list of posts or points and there is an issue with one specific item. I don’t want the whole flow to break simply because an exception was raised for one post. Instead, I’ll log the warning and continue. With database operations, however, the global context manager means that any exception that reaches it automatically rolls back the entire outer transaction, including all other posts. Therefore I’ll use SQLAlchemy’s nested transactions (Session.begin_nested()). This feature is essentially an abstraction over PostgreSQL’s SAVEPOINT feature, which allows the application to set savepoints within a transaction and roll back part of it without affecting the rest. Using that, I can roll back only the part of the transaction that failed (e.g. the specific post that raised the exception).
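The per-item savepoint pattern could be sketched roughly like this. The save_items helper and the save_one callback are hypothetical names; the only SQLAlchemy call assumed is Session.begin_nested() used as a context manager, which opens a SAVEPOINT and rolls back to it if the block raises.

```python
import logging
from typing import Any, Callable, Iterable

logger = logging.getLogger("analysis")


def save_items(session: Any, items: Iterable[Any],
               save_one: Callable[[Any, Any], None]) -> int:
    """Persist each item inside its own SAVEPOINT; log and skip the ones that fail."""
    saved = 0
    for item in items:
        try:
            # begin_nested() opens a SAVEPOINT. If the block raises, only the
            # work since that savepoint is rolled back; earlier items stay in
            # the outer transaction.
            with session.begin_nested():
                save_one(session, item)
            saved += 1
        except Exception:
            # Logged as a warning because the main flow continues.
            logger.warning("Skipping item %r", item, exc_info=True)
    return saved
```

The outer transaction is still committed (or rolled back) by the surrounding session_scope; the savepoints only decide which individual items make it into that transaction.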