A leading human resource capital management company in the DC area, with over 2 billion user records containing contact and other pertinent information used by hospitals and medical facilities.


Process automation to increase efficiency

Setting the scene

The company discussed in this case study is a leading human resource capital management company in the DC area, with over 2 billion user records containing contact and other pertinent information used by hospitals and medical facilities.

Its records, used for background verification and employment eligibility checks, arrive as batch files every month and must be processed and reconciled with existing records.

The ingestion process validates the completeness of the incoming records and stores them in a staging MongoDB database. The processing app then picks up each record, searches a SOLR cluster and applies the appropriate updates.
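The completeness check at the start of ingestion can be sketched as follows. This is a minimal illustration, not the client's implementation: the field names in REQUIRED_FIELDS and the helper names are hypothetical, since the case study does not describe the record schema, and the MongoDB and SOLR steps are left out.

```python
from typing import Iterable

# Hypothetical required fields; the real record schema is not described here.
REQUIRED_FIELDS = ("record_id", "full_name", "contact")


def is_complete(record: dict) -> bool:
    """Return True when every required field is present and non-blank."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            return False
    return True


def split_batch(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Separate a batch into records ready for staging and records to reject."""
    staged, rejected = [], []
    for record in records:
        (staged if is_complete(record) else rejected).append(record)
    return staged, rejected
```

In a pipeline like the one described, only the first list would be written to the staging database, while the second would be reported back for correction.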

This process took over 30 days to complete, causing backlogs, inefficiencies and missed revenue opportunities.

The client approached us for a hosting solution in AWS, but cautioned us against making any changes to the application stack, as it was a HIPAA-compliant, approved configuration.


Our engineers proposed a streamlined approach: build several clusters in AWS to process the batches within 3 days:

  • a 100-node MongoDB cluster
  • a 100-node SOLR cluster
  • a 100-node application cluster

All the parameters required to build these clusters were specified in configuration files, including OS, software version, connection information, deployment artifacts, number of instances, disk specifications and database authentication.
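To illustrate what such a parameter set might look like, here is a minimal sketch in Python. The project itself used Chef configuration files, so this is an analogy rather than the actual format. The 100-node counts come from this case study; the class name, field names and every other value are placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    """Parameters needed to build one cluster (field names are illustrative)."""
    name: str
    os_image: str          # placeholder for the mandated OS and patch level
    software_version: str  # placeholder for the pinned software version
    instance_count: int
    disk_size_gb: int
    auth_enabled: bool = True

    def validate(self) -> None:
        """Reject obviously unusable values before any infrastructure is built."""
        if self.instance_count < 1:
            raise ValueError(f"{self.name}: instance_count must be at least 1")
        if self.disk_size_gb < 1:
            raise ValueError(f"{self.name}: disk_size_gb must be at least 1")


# Node counts come from the case study; every other value is a placeholder.
CLUSTERS = [
    ClusterConfig("mongodb", "approved-os-image", "pinned-version", 100, 500),
    ClusterConfig("solr", "approved-os-image", "pinned-version", 100, 500),
    ClusterConfig("application", "approved-os-image", "pinned-version", 100, 100),
]

for cluster in CLUSTERS:
    cluster.validate()
```

Keeping every tunable value in one declarative structure means a cluster can be resized or rebuilt by editing data rather than code.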

The clusters were then created and destroyed on demand using Chef tools, so they could be easily adjusted to meet increased workload, based on the number of files to process and the speed of ingestion and processing.
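The sizing decision behind an on-demand cluster can be sketched as simple arithmetic. The function below is a hypothetical helper, not part of the delivered tooling, and the throughput figure in the usage note is an invented example, since the case study does not publish per-node processing rates.

```python
import math


def nodes_needed(files_to_process: int,
                 files_per_node_per_day: float,
                 deadline_days: float) -> int:
    """Estimate how many nodes are required to finish a batch by the deadline."""
    if files_per_node_per_day <= 0 or deadline_days <= 0:
        raise ValueError("throughput and deadline must be positive")
    # Total capacity of one node over the whole processing window.
    capacity_per_node = files_per_node_per_day * deadline_days
    return math.ceil(files_to_process / capacity_per_node)
```

For example, under the assumed rate of 10 files per node per day, a batch of 3,000 files with a 3-day deadline gives nodes_needed(3000, 10, 3), which evaluates to 100 nodes.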

Our engineers developed all the Chef recipes, along with several utilities to monitor, stop, start and destroy clusters, run arbitrary commands and deploy code in a dynamic environment.


The cluster contained several moving parts that needed to be coordinated precisely and started in a specific order.
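One common way to derive a safe start order is a topological sort over component dependencies. The sketch below uses Python's standard graphlib module; the dependency map is an assumption made for illustration (the application tier waiting on both data stores), as the case study does not list the actual ordering rules.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key lists the components
# that must already be running before it can start.
DEPENDS_ON = {
    "mongodb": set(),
    "solr": set(),
    "application": {"mongodb", "solr"},
}


def startup_order(depends_on: dict) -> list:
    """Return components in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on: dict) -> list:
    """Stop components in the reverse of the start order."""
    return list(reversed(startup_order(depends_on)))
```

With this map, the application tier always starts last and stops first; the relative order of the two data stores is left open because neither depends on the other.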

The customer had a tight budget, and the time allocated to understand the environment was extremely aggressive (4 weeks). This included:

  • choosing the right instance sizes, memory footprints, storage size and performance
  • developing the Chef infrastructure
  • integration testing and validation of the design

The customer wanted the new environment to process the next set of batches, expected at the end of January 2016.

We had to adhere to the stack specification, including OS, patch level and software (Java, Tomcat, MongoDB and SOLR).


The customer now has an automated way to build and deploy infrastructure clusters rapidly using configuration parameters.

Prior to this project, the customer had no experience with AWS. Now, within a matter of weeks, they are able to manage complex infrastructure.

We helped automate processes (such as code delivery to S3 buckets) to meet long-pending compliance requirements, such as code archival, DR/BCP and infrastructure automation.