Just recently I wrote about why you might want to consider using a NoSQL database like MongoDB over the more traditional relational model of SQL Server. You can read that article right over here.
The main reason that I wrote that article was because I have been an avid user of SQL Server for almost 15 years now. It's what powers most of my websites and what most companies that I've worked for have used as well. But, as with most things in life, there are some challenges that come with using it, mainly:
- Difficult to maintain schema
- Database size limitations
- Difficult to scale
- More expensive
- Much more...
There is a time and place for using a relational database, don't get me wrong. But after 7 years of running this blog, I am starting to consider alternatives that might make more sense for me personally.
This is not an easy process and I do not recommend that you make a total database transition without putting in the research and without having a roadmap to rely on.
Having said that, here is a brief look at my current plan to migrate over to a NoSQL database in the next couple of months, starting with the simplest step.
Before you start, it's important to know exactly what your server requirements are and just how MongoDB is configured to run.
For me personally, I find that the lowest-tier cluster on MongoDB is adequate for this particular blog. I base that on the fact that the free tier comes with 500MB of free storage, which, based on my estimates, is still several times more than I currently require.
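One quick way to sanity-check that kind of estimate is to serialize a representative document and multiply by the expected record count. The document shape below is purely hypothetical (my own stand-in for a blog post), and JSON size is only a rough proxy for what BSON will actually occupy on disk, but it gets you in the right ballpark:

```python
import json

# A hypothetical blog post document, used only for estimation.
sample_post = {
    "title": "Why I am considering MongoDB",
    "slug": "why-i-am-considering-mongodb",
    "body": "x" * 8000,  # assume ~8KB of article HTML on average
    "tags": ["mongodb", "sql-server", "migration"],
    "published": "2020-11-01T00:00:00Z",
}

def estimate_storage_mb(doc: dict, doc_count: int) -> float:
    """Rough storage estimate: JSON byte size as a proxy for BSON size."""
    bytes_per_doc = len(json.dumps(doc).encode("utf-8"))
    return bytes_per_doc * doc_count / (1024 * 1024)

estimate = estimate_storage_mb(sample_post, doc_count=2000)
print(f"~{estimate:.1f} MB for 2,000 posts")
```

Even a couple of thousand long-form posts lands well under the free tier's limit under these assumptions.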
All of this is managed through MongoDB's cloud hosting environment, Atlas, which takes away much of the extra work required to set up and maintain a database server. And if I ever need to expand my storage capacity, because Atlas is a cloud hosting environment, I can gradually scale out to meet demand.
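For the curious, connecting to an Atlas cluster from code mostly comes down to building a `mongodb+srv` connection string. The cluster hostname, user, and database name below are placeholders, not my real values; the credential-escaping via `quote_plus` is the approach the pymongo documentation recommends:

```python
from urllib.parse import quote_plus

def atlas_uri(user: str, password: str, host: str, db: str) -> str:
    """Build a mongodb+srv connection string, escaping the credentials."""
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{db}?retryWrites=true&w=majority"
    )

uri = atlas_uri("blog_user", "s3cret/pass", "cluster0.example.mongodb.net", "blog")

# With the pymongo driver installed, connecting is then just:
#   from pymongo import MongoClient
#   client = MongoClient(uri)
#   posts = client.blog.posts
```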
And that's essentially it for configuration and also one of the reasons why it was at the top of my list as far as database alternatives are concerned.
The data migration itself is a time-consuming and challenging part. Not because it is technically difficult, but because transferring millions of records into a completely different format is bound to cause problems somewhere in the chain.
The big challenge is that data currently in a table -> row -> column configuration will first need to be transformed into a JSON-style document. For this, I will probably need to create custom data migration scripts that convert each row in SQL Server into its appropriate format in MongoDB. Once I have a generic script that is stable, I can essentially run it on every table that I am migrating over.
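The core of such a script might look something like the sketch below. This is my own minimal take, not a finished migration tool, and the column names are hypothetical; a real version would pull rows through a SQL Server driver and insert the results with pymongo:

```python
from datetime import datetime

def row_to_document(row: dict) -> dict:
    """Convert one flat SQL row (a dict of column values) into a MongoDB-style document."""
    doc = {}
    for column, value in row.items():
        if value is None:
            continue  # omit NULL columns instead of storing them
        if isinstance(value, datetime):
            value = value.isoformat()  # BSON also stores dates natively; ISO keeps it simple
        doc[column] = value
    # Carry the SQL primary key over so records can be cross-checked after migration.
    if "Id" in doc:
        doc["_id"] = doc.pop("Id")
    return doc

row = {"Id": 42, "Title": "Hello", "PublishDate": datetime(2020, 11, 1), "DeletedAt": None}
print(row_to_document(row))
```

Because the function knows nothing about any particular table, the same script can be pointed at each table in turn.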
Because MongoDB handles relationships differently than SQL Server, though, I will need to take some precautions when merging multiple table records into a single document. At least, that is the assumption for now.
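The main precaution is deciding which one-to-many relationships get embedded rather than kept as separate collections. As a purely illustrative sketch (the table and column names are again my own invention), merging a comments table into its parent posts could look like this:

```python
def embed_comments(posts: list, comments: list) -> list:
    """Merge a one-to-many SQL relationship (posts -> comments) into embedded documents."""
    by_post = {}
    for c in comments:
        # Drop the foreign key; embedding makes it implicit.
        post_id = c.pop("PostId")
        by_post.setdefault(post_id, []).append(c)
    return [{**p, "comments": by_post.get(p["Id"], [])} for p in posts]

posts = [{"Id": 1, "Title": "First post"}]
comments = [{"Id": 10, "PostId": 1, "Text": "Nice article!"}]
print(embed_comments(posts, comments))
```

The trade-off is that embedding works well for data that is always read together (a post and its comments) but poorly for rows referenced from many places, which would stay in their own collection.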
The actual migration to MongoDB should not be challenging as far as I know. Aside from the time that it takes to import the data, I don't see any other roadblock just yet.
While the data is transferring, I can work on the next part in the equation, which is probably the most time-consuming and error-prone step.
Rewriting the DAL (Data access layer)
The DAL is essentially the middleware that reads from the database (SQL Server), cleans and packages the data into objects and finally sends it over to the client to render. Currently, that DAL is configured to read from relational databases only. This includes SQL Server, MySQL and PostgreSQL.
I'll say now that I do not have any code written just yet that can manage reading from and writing to a MongoDB cluster. I will have to spend the better part of a few weeks working on this, as it is the most important and most sensitive step.
My plan here, however, is not to completely get rid of the old layer, but to slowly integrate the new MongoDB layer on top of what I currently have. This might mean that, temporarily, some data comes from SQL Server while other data comes from MongoDB. This will give me time both to polish up the new code and to measure the performance of the new database, as I have absolutely no clue just yet how it is going to behave.
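In rough strokes, a transitional DAL like that could route each collection to whichever backend currently owns it. The interface and class names below are my own sketch (stubbed backends instead of real database calls), not the blog's actual code:

```python
from abc import ABC, abstractmethod

class Repository(ABC):
    """Minimal read interface the rest of the site codes against."""
    @abstractmethod
    def get_all(self, collection: str) -> list: ...

class SqlRepository(Repository):
    def get_all(self, collection):
        # A real version would query SQL Server; stubbed for illustration.
        return [{"source": "sql", "collection": collection}]

class MongoRepository(Repository):
    def get_all(self, collection):
        # A real version would query the MongoDB cluster via pymongo.
        return [{"source": "mongo", "collection": collection}]

class TransitionalDal(Repository):
    """Routes each collection to whichever backend has been migrated."""
    def __init__(self, migrated):
        self.sql = SqlRepository()
        self.mongo = MongoRepository()
        self.migrated = set(migrated)

    def get_all(self, collection):
        backend = self.mongo if collection in self.migrated else self.sql
        return backend.get_all(collection)

dal = TransitionalDal(migrated={"comments"})
print(dal.get_all("comments")[0]["source"])  # comments already moved over
print(dal.get_all("posts")[0]["source"])     # posts still served from SQL Server
```

Because both backends satisfy the same interface, finishing the migration is just a matter of growing the `migrated` set and eventually deleting the SQL implementation.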
Sign up to the newsletter to stay updated on that entire process as it should be a good learning experience for myself and for anyone wanting to follow along.
The goal in this step is also to create a new MongoDB library that I can use on other projects as well and to compile it into a DLL for future use. This also takes time to architect out and to test, but it is definitely worth the extra time that it will take.
The last step is of course to test everything thoroughly. This means everything from testing performance and read/write speeds to testing my SEO to ensure that the TTFB (time-to-first-byte) is still adequate so that search engines don't penalize me in any way.
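TTFB can be measured with nothing but the standard library. The snippet below is a minimal sketch: it spins up a throwaway local server purely so the example is self-contained, whereas a real check would point at the live blog's hostname instead:

```python
import http.client
import http.server
import threading
import time

def measure_ttfb(host: str, port: int, path: str = "/") -> float:
    """Return seconds elapsed from sending a GET until the first response byte arrives."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    start = time.perf_counter()
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read(1)  # first byte received: this is the TTFB moment
    ttfb = time.perf_counter() - start
    resp.read()   # drain the rest of the body
    conn.close()
    return ttfb

if __name__ == "__main__":
    # Throwaway local server on a random free port, just for the demo.
    server = http.server.HTTPServer(("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(f"TTFB: {measure_ttfb('127.0.0.1', port) * 1000:.1f} ms")
    server.shutdown()
```

Running this against the production URL before and after the cutover gives a simple regression check for the SEO side of things.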
If everything goes smoothly and as planned, then you should technically see absolutely no difference in site performance, and if it improves, that is an extra win. But my overall goal is not to improve performance; it is to future-proof this blog. That means that if in 2021 my traffic increases to 1 million monthly visitors, I won't have to even think about how I am going to scale the database.
That I can simply hit a switch and double my storage space or RAM without any kind of downtime.
This is the plan anyway. Do check back often as I plan on documenting this whole process and letting everyone know just what the challenges are and what I am learning as I dive into this new chapter of the blog.