July 14, 2023 • 5-minute read
My Continuous Localization SaaS Solution
How I developed a solution for continuous localization using Spring Boot, React.js and AWS
The Problem
Localization (l10n) is the process of translating and adapting software content to the language and culture of its audience. Global digital products present an additional challenge because their content must be localized to different regions of the world before each and every product release. This represents a huge pain point in the software development life cycle.
I've seen companies take a waterfall approach to this process, waiting until the sprint was almost over to send the content to translators. Testing, which is almost done by that point, then has to be entirely redone once the translation is complete. In some cases QA waited until all translation work was finished before testing the new features, causing deadlines to slip.
Another problem is the management of the translation jobs themselves. Unending email threads over terms and a muddle of exchanged file versions can drive anyone crazy. So I looked for tools that promise to ease this pain and found several products at different levels of maturity, but I wanted to create one of my own.
Thinking Out a Solution
I started out by defining a translation workflow involving different roles. The product owner, with an admin role, would save the translation keys, which we'll call verses, at the start of the sprint, adding a context description for every verse, so that translators could start their work right away.
The state machine diagram below shows the states and transitions for a translation.
An additional developer role grants permission to view the published translations and export them to files that can be added to the software project.
Admins can also invite members to their projects and assign them different roles.
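A workflow like the one above can be modeled as a small table of allowed transitions. The sketch below is only an illustration: the state names (PENDING, TRANSLATED, APPROVED, PUBLISHED) and the transitions between them are assumptions made for this example, not the ones from the article's diagram, and the class name is hypothetical.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class TranslationWorkflow {

    // Hypothetical states; the real ones are defined by the state machine diagram.
    enum State { PENDING, TRANSLATED, APPROVED, PUBLISHED }

    // For each state, the set of states it may move to (assumed for illustration).
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);

    static {
        ALLOWED.put(State.PENDING, EnumSet.of(State.TRANSLATED));
        ALLOWED.put(State.TRANSLATED, EnumSet.of(State.APPROVED, State.PENDING));
        ALLOWED.put(State.APPROVED, EnumSet.of(State.PUBLISHED, State.TRANSLATED));
        ALLOWED.put(State.PUBLISHED, EnumSet.noneOf(State.class));
    }

    static boolean canTransition(State from, State to) {
        return ALLOWED.get(from).contains(to);
    }

    // Returns the new state, or throws if the move is not in the table.
    static State transition(State from, State to) {
        if (!canTransition(from, to)) {
            throw new IllegalStateException("Illegal transition: " + from + " to " + to);
        }
        return to;
    }

    public static void main(String[] args) {
        State state = State.PENDING;
        state = transition(state, State.TRANSLATED);
        state = transition(state, State.APPROVED);
        state = transition(state, State.PUBLISHED);
        System.out.println(state); // prints PUBLISHED
    }
}
```

Keeping the transitions in one table means a role check (for example, only an admin may publish) can be added in a single place, next to the transition check.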
Design
The project was defined to be a three-tiered, single-page application with the following stack:
Front End
- React.js with TypeScript
- Ant Design (design system with useful ready-made React components)
- Styled Components (CSS-in-JS)
- Vite (simple build tool to generate the static website)
Back End
- Spring Boot with Java 17
- Gradle for dependency management and building
- Spring Security for the management of authentication and authorization with different roles
- Flyway for schema management
- JUnit 5 and Mockito for unit tests
- Testcontainers and MockMvc for integration tests
Data Layer
- MySQL database
Working as a solo dev on this project, I decided to write automated tests only for the back end. The project includes features such as:
- email verification upon registration,
- namespacing of verses in a project,
- back-end paginated views with sorting and filtering,
- email and in-app notifications on the invitations,
- statistical graphs, and
- translation role segregation by locale
Here are some screenshots of the application:
Infrastructure
The infrastructure is provisioned on AWS with the architecture described in the diagram below:
The front-end application is deployed as a static website to an S3 bucket and distributed with CloudFront. The domain verxiom.com had been registered with GoDaddy, so I needed to copy the relevant records from a hosted zone in Route 53 in order to point the domain to this infrastructure. The subdomains api.verxiom.com and mail.verxiom.com are used for the API endpoints and Amazon Simple Email Service (SES), respectively.
The API runs as a container on ECS Fargate, which pulls the Docker image from ECR. An EC2 bastion server allows administrative connections to the production database from outside via SSH tunneling.
DevOps
All infrastructure is provisioned with Terraform, except SES identities, secrets, and some IAM policies. As for CI/CD, AWS CodeBuild alone is used to deploy the front end, while AWS CodePipeline manages the sourcing, testing, building, and deployment of the API. Both are triggered automatically by a commit to the default branch of their respective repos on GitHub.
Because the back-end infrastructure is costly to keep running, I only provision it occasionally for tests and demos. The cheaper front end, by contrast, is permanently live, with login and registration pages visible at https://verxiom.com
Security
Security deserves a section of its own because minimizing the attack surface by design requires a lot of effort. Spring Security is an awesome tool in this regard, but its flexibility comes at the price of considerable complexity. I'm considering the use of an authentication server in the future.
Application Security
The API has its endpoints secured with JWT authentication/authorization, and access control lists (ACLs) define who has access to what. On the front end, an Axios response interceptor catches a 403 Forbidden response and sends a special request to refresh the token, keeping the user authenticated when the short-lived token expires. If the token was invalid in the first place, both the refresh and the authentication fail.
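The control flow of that refresh mechanism can be sketched independently of Axios. The example below models it in plain Java for illustration only; in the application this logic lives in the browser. Every name here (TokenRefreshSketch, Response, sendWithRefresh) is hypothetical, and repeating the original request once after a successful refresh is an assumption about how the interceptor behaves.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

final class TokenRefreshSketch {

    static final int FORBIDDEN = 403;

    // Minimal stand-in for an HTTP response: status code plus body.
    record Response(int status, String body) {}

    /**
     * Sends a request with the current access token. On a 403 it asks for a
     * fresh token once and repeats the request. If the refresh fails, or the
     * retry is rejected too, the caller sees the failing response.
     */
    static Response sendWithRefresh(String accessToken,
                                    Function<String, Response> send,
                                    Supplier<Optional<String>> refresh) {
        Response first = send.apply(accessToken);
        if (first.status() != FORBIDDEN) {
            return first;
        }
        Optional<String> renewed = refresh.get(); // empty when the refresh is refused
        if (renewed.isEmpty()) {
            return first;
        }
        return send.apply(renewed.get()); // single retry, no further refresh attempts
    }

    public static void main(String[] args) {
        // Fake API that accepts only the token "fresh".
        Function<String, Response> api = token -> "fresh".equals(token)
                ? new Response(200, "ok")
                : new Response(FORBIDDEN, "expired");

        Response response = sendWithRefresh("stale", api, () -> Optional.of("fresh"));
        System.out.println(response.status()); // prints 200
    }
}
```

Limiting the flow to one retry avoids an endless loop when the server keeps answering 403, which is the case where the token was never valid.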
Spring Security handles the configuration for cross-origin resource sharing (CORS) and cross-site request forgery (CSRF). CORS is set up to satisfy the browser's preflight requests. CSRF protection is less of a concern because the JWT travels in a request header, so it is disabled.
Cross-site scripting (XSS) is mitigated by React.js, which automatically escapes string values before rendering them and takes functions as event handlers instead of strings that could contain malicious code.
All user passwords are salted and hashed before being stored in the database, reducing the damage that dictionary attacks and password leaks can do. Password reset and email change workflows were also designed to avoid abuse while guaranteeing availability to legitimate users.
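To illustrate what salted hashing involves, here is a minimal sketch using only the JDK. The article does not name the algorithm the application uses (Spring Security normally handles this through a PasswordEncoder), so PBKDF2 is swapped in here purely to keep the example free of dependencies. The class name, iteration count, and key and salt sizes are assumptions.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class SaltedPasswordHash {

    // Assumed parameters for illustration; tune them for real deployments.
    private static final int ITERATIONS = 210_000;
    private static final int KEY_BITS = 256;
    private static final int SALT_BYTES = 16;

    // A fresh random salt per user makes identical passwords hash differently.
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives the hash that would be stored alongside the salt.
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword();
        }
    }

    // Recomputes the hash for a login attempt and compares in constant time.
    static boolean matches(char[] candidate, byte[] salt, byte[] expected) {
        return MessageDigest.isEqual(hash(candidate, salt), expected);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        byte[] stored = hash("correct horse".toCharArray(), salt);
        System.out.println(matches("correct horse".toCharArray(), salt, stored)); // prints true
        System.out.println(matches("wrong guess".toCharArray(), salt, stored));   // prints false
    }
}
```

The salt is not secret and is stored next to the hash. Its job is to defeat precomputed dictionaries, while the iteration count slows down each individual guess.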
Network Security
SSL/TLS certificates are managed by AWS Certificate Manager and attached both to the front-end website and to the API, so that all communication between the user's device and the application is encrypted. Most of the network security aspects are managed by AWS, which enforces best practices.
DevOps Security
Secrets and passwords are kept out of the source code and stored in AWS Systems Manager Parameter Store or in environment variables. The master user of the MySQL database is reserved for administrative access and for the schema management tool, while the application is assigned a user restricted to SQL Data Manipulation Language (DML), in compliance with the principle of least privilege. A future improvement would be to add static code analysis and dependency analysis to the CI/CD pipeline.
Beyond the Minimum Viable Product (MVP)
Future improvements for this project can include:
- OAuth 2.0 and social login employing a dedicated auth server with Keycloak or Amazon Cognito
- Import of projects
- Syncing of projects using JGit
- Distributed caching of notifications using Redis
- Invitations to unregistered users
- Task management with a Kanban board for translators
- Image upload for profile picture and, more importantly, translation context description
- Addition of observability using Spring Boot Actuator with tools such as Prometheus and Grafana
Conclusion
It took me about a year of working in my free time to get the solution to this point, and my understanding of DevOps and infrastructure has improved greatly thanks to the challenges faced along the way. I could put my Java and React skills into practice in a relaxed setting, working out solutions to knotty problems such as the password reset and email change workflows. Most of all, I really enjoyed building a solution to a real-world problem.