Modernizing to the cloud is an enterprise inevitability. Organizations need to scale faster to meet application traffic demands. They need to support a faster go-to-market strategy and switch from a capex to an opex model. Organizations that have embarked on this journey have largely realized that using a “lift and shift” strategy alone doesn’t provide the best return on investment. They need a better strategy, one that takes advantage of newer technologies available as managed services across multiple public cloud providers.
Adopting such a strategy requires modernizing applications before cloud migration. Application modernization seeks to break down large monoliths into more agile microservices-based applications. When this happens, organizations need to rethink their data architecture. While microservices are primarily stateless, enterprise applications need to store and manage business data. Such an architecture needs a modern database that:
- Allows developers to make frequent application changes without the need to constantly refine database schemas and worry about the structure of existing data.
- Caters to the scale of different microservices.
- Serves the high availability requirements of such an architecture.
- Provides support for different application workloads like web, mobile, serverless, distributed, etc.
- Provides support for different use cases such as document, time series, geospatial, graph, search, etc.
- Has a variety of drivers and integrations available, making it easy to integrate with other components in a microservices architecture.
- Provides high standards of security and compliance.
- Can be more cost effective than traditional relational databases, helping to lower the costs of deploying and maintaining a microservices architecture.
The MongoDB Atlas developer data platform caters to all the above requirements and more, so organizations can seamlessly modernize their data architecture as part of their application modernization initiatives.
When moving from a relational data model to a document-based model (in MongoDB), it is imperative to understand the application workload requirements. A three-step strategy can help with this discovery:
- Understanding the relational database structure and mapping it to an optimal document model schema in MongoDB.
- Understanding the source code that is tightly coupled to the legacy database. Items to consider include triggers, stored procedures, SQL functions, and database links. A loosely coupled architecture would map these items to APIs and independent microservices.
- Understanding the overall application workload requirements, including batch and transactional requirements. This information can then be combined with the findings from the previous steps to design an appropriate MongoDB schema, the required indexes and views, and a MongoDB architecture that can fulfill the application's performance needs.
Many organizations have adopted the above strategy for modernizing their existing apps into microservices using MongoDB’s document-based data model. This approach has not only led to improved scalability and performance of the app but also enabled organizations to deliver new features and capabilities faster.
Understanding the existing relational database structure
There are many benefits to moving from a relational to a document-based model. MongoDB follows the philosophy of ‘data that is accessed together is stored together.’ This allows users to avoid unnecessary costly joins that normally plague legacy databases. At the same time, the document model is very intuitive for developers as it allows them to represent their data as objects, which is very similar to how they work with data in code.
Information needed to modernize a legacy data model:
- We need to start with the existing schema of the relational database. This includes the table structure, relationships amongst tables, and data constraints.
- Another important aspect to consider is the classification and nature of the data and the workload. As large legacy applications evolve over time, more often than not, there are parts of the application and the corresponding data that are no longer used. Understanding this ‘dead code’ and these ‘dead data’ tables will help simplify and optimize the target application.
CAST Imaging provides a powerful way to explore the relational data model. It provides a graphical view that is easy to understand and helps with establishing the current state of the application database.
Figure 1.1: Relational database map in CAST Imaging
In the relational database displayed above, a number of “natural” candidates for structuring the data as a collection of documents in MongoDB are immediately visible. Furthermore, CAST Imaging provides contextual access to the source code of the schema as shown on the left. The “MORTGAGES” table has been selected and is linked with the “REPORTS” table via a foreign key and a constraint. Depending on the nature of the relationship (1:1, 1:Many, Many:Many), a corresponding candidate can be chosen for the MongoDB data model. One such mapping could be a MongoDB model that consists of a collection for MORTGAGE where each MORTGAGE is represented by a document with an array of subdocuments for REPORTS. Additionally, validation rules may be used to enforce schema and data validation.
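As a minimal sketch of the MORTGAGE/REPORTS mapping described above, the following Python example folds rows from two relational tables into one document per mortgage with an embedded array of reports. The column names (`borrower`, `amount`, `title`, `created_on`, `mortgage_id`) and the sample rows are assumptions made for illustration; they are not taken from the schema shown in CAST Imaging. An in-memory SQLite database stands in for the legacy relational database.

```python
import sqlite3


def demo_connection():
    """Create an in-memory stand-in for the legacy database (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MORTGAGES (id INTEGER PRIMARY KEY, borrower TEXT, amount REAL);
        CREATE TABLE REPORTS (
            id INTEGER PRIMARY KEY,
            mortgage_id INTEGER REFERENCES MORTGAGES(id),
            title TEXT,
            created_on TEXT
        );
        INSERT INTO MORTGAGES VALUES (1, 'A. Borrower', 250000.0), (2, 'B. Borrower', 180000.0);
        INSERT INTO REPORTS VALUES
            (1, 1, 'Appraisal', '2023-01-10'),
            (2, 1, 'Credit check', '2023-01-12');
    """)
    return conn


def build_mortgage_documents(conn):
    """Fold each MORTGAGES row and its REPORTS rows into one document."""
    conn.row_factory = sqlite3.Row
    documents = []
    for m in conn.execute("SELECT id, borrower, amount FROM MORTGAGES ORDER BY id"):
        reports = conn.execute(
            "SELECT title, created_on FROM REPORTS WHERE mortgage_id = ? ORDER BY id",
            (m["id"],),
        ).fetchall()
        documents.append({
            "_id": m["id"],
            "borrower": m["borrower"],
            "amount": m["amount"],
            # The 1:Many relationship becomes an array of subdocuments.
            "reports": [dict(r) for r in reports],
        })
    return documents


mortgage_docs = build_mortgage_documents(demo_connection())
```

With PyMongo, a list like `mortgage_docs` could then be passed to `insert_many` on the target collection. Whether embedding is the right choice depends on how many reports a mortgage accumulates and how they are queried; a very large or unbounded set of reports may be better kept in its own collection.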
Figure 1.2: Schema source code for database tables
Relational schemas are often supplemented with views that are built specifically to present data from a normalized database in a more fit-for-purpose way. CAST Imaging helps not only to display the views but also to explore the code that uses them.
Figure 1.3: View of linked objects (database views)
The above screen capture shows how easy it is with CAST Imaging to discover views in a relational schema model. Here we have focused on the views related to the MORTGAGES table. When displaying the complete schema, it appears that MORTGAGES is linked not only to REPORTS but also to USERS. The way the relational schema was originally built, and the fact that a view linking USERS and MORTGAGES is necessary for the business application, will help in understanding and designing the corresponding MongoDB schema.
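One possible way to carry a USERS/MORTGAGES view over to MongoDB is an aggregation pipeline built around `$lookup`. The sketch below only constructs the pipeline as plain Python data. The collection names (`users`, `mortgages`) and field names (`user_id`, `name`, `borrower`, `amount`) are hypothetical, since the actual view definition is not reproduced here.

```python
def user_mortgage_view_pipeline(local_field="user_id"):
    """Build an aggregation pipeline that joins each mortgage to its user.

    All field and collection names are illustrative assumptions.
    """
    return [
        # Join mortgages to users, similar to the join in the relational view.
        {"$lookup": {
            "from": "users",
            "localField": local_field,
            "foreignField": "_id",
            "as": "user",
        }},
        # $lookup yields an array; flatten it to a single embedded user.
        {"$unwind": "$user"},
        # Keep only the fields the view's consumers are assumed to need.
        {"$project": {"borrower": 1, "amount": 1, "user_name": "$user.name"}},
    ]


pipeline = user_mortgage_view_pipeline()
```

With PyMongo, such a pipeline could be run with `aggregate` on the mortgages collection, or registered as a read-only view. If the view is read very frequently, embedding the needed user fields directly in each mortgage document may be a better fit than joining at query time.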
Figure 1.4: View of unused tables
Finally, CAST Imaging also provides a list of unused tables. This corresponds to the ‘dead code’ discussion that we had earlier. Identifying such dead tables will allow for further optimization of the data model in the new MongoDB database. Users can remove unnecessary complexity by eliminating these unused tables in the new database schema.
Understanding Triggers/Stored Proc(s)/SQL functions in order to modernize to an architecture based on API(s) and microservices
As applications evolve over long periods of time, the corresponding application code evolves with them. In the case of legacy applications, since there is no concept of microservices, back-end code that implements business logic often ends up in triggers, SQL functions, and stored procedures. This code is tightly coupled to the legacy database, so when the database goes through modernization, the business logic needs to find a new home. In most cases, this means implementing APIs and independent microservices.
When adopting MongoDB, it is necessary to understand how the existing application uses these features. This ensures not only that the MongoDB schema is built optimally, but also that the existing application is refactored to cover all the business needs that were previously handled by this back-end code. Below, CAST Imaging displays a view of a relational schema including all the triggers, stored procedures, and SQL functions associated with it:
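To illustrate what giving trigger logic "a new home" can look like, suppose the legacy database had a trigger that stamped a mortgage with the date of its latest report whenever a REPORTS row was inserted. That trigger is a hypothetical example, not one taken from the application analyzed here. In a microservice, the same rule could live in a small function that builds a single MongoDB update, as in this sketch (field names are assumptions):

```python
def build_add_report_update(mortgage_id, report):
    """Return the (filter, update) pair a service would send to MongoDB.

    Replaces a hypothetical AFTER INSERT trigger on REPORTS: the report is
    appended and the derived field is refreshed in the same update.
    """
    mortgage_filter = {"_id": mortgage_id}
    update = {
        "$push": {"reports": report},
        "$set": {"last_report_on": report["created_on"]},
    }
    return mortgage_filter, update


report = {"title": "Appraisal", "created_on": "2023-01-10"}
mortgage_filter, update = build_add_report_update(1, report)
```

With PyMongo, the pair could be passed to `update_one(mortgage_filter, update)`. Because both changes target one document, they are applied atomically, which covers the consistency guarantee the trigger used to provide in this simple case.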
Figure 1.5: Triggers, stored procedures, and SQL functions
This graphical user interface allows for easy exploration of the database schema and the associated triggers, stored procs, and functions. Additionally, CAST Imaging provides various export options that allow for the easy sharing of detailed information with developers and other team members to help with application development and refactoring:
Figure 1.6: Exporting details for developers
Understanding the overall application workflow to define an optimal MongoDB architecture and database schema
When working with business logic, it is imperative to understand it from a business as well as a technical standpoint. This is often achieved by interacting with business and operational teams. From a technical point of view, it is also important to understand the overall data flow. For example:
- What are the different channels through which users can interact with the application?
- How does that interaction translate into a transactional (query) workload?
- What are the busiest queries, tables, and views for the application?
- What is the frequency and impact of batch jobs associated with business logic?
- What is the frequency and impact of reporting and analytics requirements on the database?
CAST Imaging can provide considerable visibility in this area. Below is a detailed end-to-end view of all the application transactions and batch processes.
Figure 1.7: Application transaction view
Mapping application workflows allows architects to prioritize the application transactions and model the new MongoDB database and application queries to optimize for the most critical business functions. MongoDB’s native JSON support maps naturally to the way developers think in code. Objects in code can be represented as objects in the database without the need to split them across multiple tables for the sake of normalization.
Data platform modernization is key to the success of any application modernization initiative. Legacy databases are not equipped to work with modern agile architectures in single or multi-cloud environments. The MongoDB flexible data model allows applications to adapt their data model to the growing needs of the business without having to worry about complex schema redesign or transformation of existing data. MongoDB Atlas brings this agility to the cloud, allowing organizations to deploy scalable and secure MongoDB environments across multiple zones, regions, and cloud providers. In this transformation journey, CAST Imaging can play a key role in identifying the candidate applications for modernization and diving deeper into the current state of these modernization candidates. Create a free MongoDB Atlas account today and explore the platform for yourself.
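The point about objects in code mapping to objects in the database can be sketched with Python dataclasses. The `Mortgage` and `Report` classes and their fields are hypothetical and only serve to show that a nested object converts directly into a document-shaped dictionary, with no splitting across tables.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Report:
    title: str
    created_on: str


@dataclass
class Mortgage:
    borrower: str
    amount: float
    reports: List[Report] = field(default_factory=list)


mortgage = Mortgage("A. Borrower", 250000.0, [Report("Appraisal", "2023-01-10")])

# asdict() recurses into nested dataclasses and lists, producing a plain
# dictionary with the same shape as the in-memory object.
mortgage_document = asdict(mortgage)
```

A dictionary like `mortgage_document` has the shape a MongoDB driver expects for an insert, so the application's object model and the stored document stay aligned.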
Paresh Saraf leads the ISV/Tech Partner Solutions Consulting team globally for MongoDB. He has been working with partners and customers across the globe to ease their journey from on-prem legacy systems to the cloud and help them realize the value of the MongoDB Data Platform. Before this, Paresh worked as an Architect/Lead in multiple organizations with core expertise in big data, AI/ML, search, and microservices.
Nooruddin Abbas Ali is a Principal Partner Solutions Architect at MongoDB. He has been in the tech industry for two decades with extensive experience in automation, cloud, and distributed data systems. This combination of data and cloud skills, coupled with his experience at various levels of the enterprise spectrum, allows him to help partners and customers on their journey to innovate in their respective fields and bring about industry disruption.
Sylvain CAILLIAU, Technical Director and technology evangelist at CAST, is a specialist in all ADM practices, Agile methods, and DevOps (over 25 years of experience). He supports IT departments in their transformation thanks to his in-depth knowledge of complex systems, his understanding of business issues, and the advanced capabilities of CAST Software Intelligence Solutions.