It is perhaps no secret that as companies grow, specialise, and focus on their core profit areas, they evolve working practices and policies to improve their efficiency. Unfortunately, this generally comes at a cost, usually an invisible one: the loss of the ability to innovate.
In over twenty years of working with some of the largest and wealthiest organisations in the world, we have also found that the ability to design simple, effective software architecture is rare.
HMx Labs has consistently addressed the lack of innovation and the poor design practices prevalent in large organisations through partnerships in the field that deliver directed, innovative solutions. We have a track record of working with enterprises of all scales to help them break out of the box and deliver simple yet effective solutions.
Our expertise lies primarily in the areas of high-performance computing and cloud-based systems.
We have worked with HPC solutions ranging from 5,000 to 100,000 cores, primarily running risk calculations for the financial services sector. This has included regulatory-driven extensions to existing systems and optimising utilisation to reduce run rates.
The ongoing shift to cloud computing is an area that enterprises with established internal processes and infrastructure often struggle with. Our strategies and architectural solutions allow our clients to transition from on-premises monolithic applications to cloud-native or hybrid systems. Again, this work has been primarily in the financial services sector, focused on risk and enterprise data systems.
The work we do is routinely covered by non-disclosure agreements (NDAs), so we are not at liberty to divulge the details of client engagements. However, the case studies below aim to provide some insight into our previous appointments.
Over the last two years we have been helping a large tier-one investment bank architect its enterprise data management platform. The system has been designed to be cloud-based, with an initial deployment to an on-premises containerised hosting environment and a view to migrating to a hybrid deployment in the future.
This has required providing solutions to typical microservice problems such as API versioning schemes, unified gateways, and the ability to scale, but also to some more complex issues specific to highly regulated organisations. These have included modelling the different levels of security that must be applied to data (down to a per-attribute or per-field level) and replicating and storing data across different physical locations while retaining performant access to it.
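As a minimal sketch of what per-attribute security can look like in practice (the entity, classification levels, and field names below are hypothetical illustrations, not the client's actual design), each field can carry a classification and records can be redacted against a caller's clearance before leaving the service:

```python
# Hypothetical classification levels, ordered lowest to highest.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

# Hypothetical per-field classification for a trade record.
FIELD_CLASSIFICATION = {
    "trade_id": "public",
    "counterparty": "internal",
    "notional": "confidential",
}

def redact(record: dict, clearance: str) -> dict:
    """Return only the fields the caller's clearance level permits.

    Unknown fields default to the highest classification, so new
    attributes are hidden unless explicitly classified.
    """
    limit = LEVELS[clearance]
    return {
        name: value
        for name, value in record.items()
        if LEVELS[FIELD_CLASSIFICATION.get(name, "confidential")] <= limit
    }

trade = {"trade_id": "T-1001", "counterparty": "ACME", "notional": 5_000_000}
print(redact(trade, "internal"))      # notional is withheld
print(redact(trade, "confidential"))  # full record visible
```

A real system would drive the classification map from metadata rather than code, but the defaulting choice (unclassified means most restricted) is the important part.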
This was an example of a relatively short project (though part of a broader engagement to investigate alternative large-scale risk calculation systems) at a large financial institution. The client had, on the one hand, an existing and under-utilised large-scale Hadoop deployment and, on the other, a lack of sufficient cores to process the ever-growing volume of risk metrics required for regulatory purposes.
We were engaged to investigate using the Hadoop cluster not only to store the risk metrics but also to perform the calculations, which is not a task Hadoop is generally used for.
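One hedged sketch of the idea, using Hadoop Streaming (the record layout and the toy metric below are illustrative assumptions, not the client's actual model): instead of a mapper that merely passes records through for storage, the mapper computes a per-record risk contribution on the nodes where the data already lives, leaving a reducer to aggregate per book:

```python
import sys

def risk_contribution(line: str) -> str:
    """Toy per-record metric: book, notional, weight -> weighted exposure.

    Hadoop Streaming expects tab-separated key/value pairs on stdout;
    emitting the book as the key lets a reducer sum exposures per book.
    """
    book, notional, weight = line.rstrip("\n").split("\t")
    exposure = float(notional) * float(weight)
    return f"{book}\t{exposure}"

if __name__ == "__main__":
    # Hadoop Streaming feeds input splits to the mapper via stdin.
    for record in sys.stdin:
        if record.strip():
            print(risk_contribution(record))
```

Submitted with something like `hadoop jar hadoop-streaming.jar -mapper mapper.py ...`, this turns the storage cluster into a compute grid for embarrassingly parallel per-record work, which is exactly the shape of many overnight risk batches.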
As regulators began to address the shortcomings of risk management in financial institutions after 2008, the volume of risk metrics requiring overnight calculation increased substantially. Coupled with an environment in which operating costs needed to be reduced at the same time, this provided a challenging backdrop for this particular engagement.
A large part of the client's requirement was to run the additional regulatory risk load without introducing additional hardware or increasing run rates. Furthermore, as is the case with many such systems, the performance metrics and telemetry needed to quickly identify areas for improvement were not available.
We achieved this by profiling the overall system and identifying inefficiencies in hardware utilisation. Redistributing load across different points in the system resulted in a saving of over $500,000 per annum, even after the additional load was introduced.
Our work in this space has been delivered successfully across multiple engagements with a number of clients, all heading in this direction. Unlike the case studies above, we are therefore at liberty to disclose the complete design.
Running large HPC clusters presents a number of well-established challenges that are solved by the incumbent solutions from providers such as IBM and TIBCO. Moving the computation to cloud-based infrastructure, particularly a hybrid of on-premises and off-premises cloud, introduces a series of new and different difficulties. These are not adequately addressed by the existing solutions (though they can, of course, be used with varying degrees of success).
This is a work-in-progress reference architecture rather than a complete solution, as the specifics of an implementation vary from client to client. We hope to evolve it to include some of the fundamental components as open-source software.
The design is available on GitHub.