Foreign and Commonwealth Office: harnessing the power of the cloud

How Scott Logic developed a secure cloud platform and intuitive front end for analysing open source data, using Amazon Web Services and machine learning.

The Client

The Foreign and Commonwealth Office’s Open Source Unit aims to increase the use of open data to help FCO deliver across a raft of policy priorities, including crisis response, consular services, and security and sustainability.

The Challenge

Wide range of data in a changing world

Analysts working with the Foreign and Commonwealth Office use a huge range of open data sources to support the goals of the department. This data comes in a wide range of types and formats, and is accessible via multiple sources, each with differing interfaces and querying capabilities.

In our fast-changing world, analysts need a central place to quickly access the exact data they need, as well as a space to securely exchange ideas and information. To this end, the Foreign and Commonwealth Office’s Open Source Unit sought a partner to help them develop a secure cloud data portal and a suite of user-friendly analysis tools.

Scott Logic was selected to design and build a new cloud platform on AWS that was intuitive, easy to use and built to the National Cyber Security Centre (NCSC) standards for handling workloads classified as OFFICIAL.

"I’ve been hugely impressed with the Scott Logic team … They have continually demonstrated considerable expertise across the range of technologies that we’ve been working with, and consequently have been able to deliver a fantastic product in a very short space of time."

Greg Haigh, Head of Data Science at Foreign and Commonwealth Office

The Solution

Serverless data pipelines, for maximum agility

The architecture was designed to be secure from the outset, whilst allowing for easy expansion and development as the project progressed. More specifically, it allows FCO data scientists without in-depth technical knowledge to add and modify data sources themselves, providing the agility our client desired. Beyond flexibility, the event-based pipeline design scales seamlessly and is resilient to failures, and the consumption-based pricing of serverless technologies makes the ingestion process exceptionally cost-effective.

Phase one of the project concentrated on designing and building a secure data platform and ingestion pipeline that could cope with the range and variety of data sources and formats needed (HTML, CSV, APIs, etc.). An initial architecture was proposed to the client, hosting an Elasticsearch server within a secure VPC and using AWS Lambda to extract and transform the data before performing named entity recognition in a separate Lambda function.
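To give a flavour of the extract-and-transform step, here is a minimal sketch in Python of a Lambda-style function that normalises CSV, JSON and HTML payloads into a common document shape. It is an illustration under stated assumptions rather than the code delivered on the project: the event fields (`source`, `format`, `body`), the output shape and the function names are all hypothetical, and storage in Elasticsearch and named entity recognition are left out.

```python
import csv
import io
import json
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def transform(source, fmt, body):
    """Turn one raw payload into a list of uniform documents.

    The {"source", "text"} document shape is an assumption for illustration.
    """
    if fmt == "csv":
        rows = csv.DictReader(io.StringIO(body))
        # One document per row, built from the row's non-empty string cells.
        texts = [
            " ".join(v for v in row.values() if isinstance(v, str) and v)
            for row in rows
        ]
    elif fmt == "json":
        data = json.loads(body)
        items = data if isinstance(data, list) else [data]
        texts = [json.dumps(item, sort_keys=True) for item in items]
    elif fmt == "html":
        parser = _TextExtractor()
        parser.feed(body)
        parser.close()
        texts = [" ".join(parser.parts)]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return [{"source": source, "text": t} for t in texts if t]


def handler(event, context=None):
    """Lambda-style entry point; the event fields are hypothetical."""
    docs = transform(event["source"], event["format"], event["body"])
    return {"count": len(docs), "documents": docs}
```

Because every format is reduced to the same document shape, a downstream step such as entity recognition would only need to understand one input structure, which is one way a pipeline like this can accept new sources without changes further along.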


Simple but powerful user interface

Phase two focused on the development of a new front-end application and began with a full-day UX workshop. As is often the case, “simple is hard.” In particular, the search input represented an interesting challenge: to create a component that supports advanced Boolean logic whilst retaining simplicity of search. The final component allows for both fuzzy matching and exact searches, with various logical operators available to further refine queries, all written in a manner that is intuitive for end-users.
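As a rough illustration of how such a search input might map onto Elasticsearch, the Python sketch below translates a simple query string into a query DSL body. The syntax is an assumption made for this example, not the syntax of the delivered component: quoted phrases become exact `match_phrase` clauses, bare words become fuzzy `match` clauses, and `AND`, `OR` and `NOT` combine them, with `AND` binding more tightly than `OR`. The `text` field name and the `build_query` function are hypothetical.

```python
import re

# A token is either a "quoted phrase" or a run of non-space characters.
_TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def _clause(phrase, word, field):
    """Exact match for quoted phrases, fuzzy match for bare words."""
    if phrase is not None:
        return {"match_phrase": {field: phrase}}
    return {"match": {field: {"query": word, "fuzziness": "AUTO"}}}


def build_query(text, field="text"):
    """Translate the user's query string into an Elasticsearch query body."""
    groups = [{"must": [], "must_not": []}]  # one group per OR-separated part
    negate = False
    for m in _TOKEN.finditer(text):
        phrase, word = m.group(1), m.group(2)
        if word == "OR":
            groups.append({"must": [], "must_not": []})
            negate = False
        elif word == "AND":
            continue  # AND is the default between adjacent terms
        elif word == "NOT":
            negate = True
        else:
            key = "must_not" if negate else "must"
            groups[-1][key].append(_clause(phrase, word, field))
            negate = False

    # Drop empty groups and empty clause lists.
    bools = [
        {"bool": {k: v for k, v in g.items() if v}}
        for g in groups
        if g["must"] or g["must_not"]
    ]
    if not bools:
        return {"match_all": {}}
    if len(bools) == 1:
        return bools[0]
    return {"bool": {"should": bools, "minimum_should_match": 1}}
```

For example, `build_query('"crisis response" AND flood NOT drill')` yields a single `bool` query with two `must` clauses (one exact, one fuzzy) and one `must_not` clause, while `build_query("flood OR storm")` wraps two alternatives in a `should` list.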

The workshop and subsequent design phases enabled us to design key pages for the new application, as well as concepts for the platform’s future expansion. The design was implemented using Vue.js and D3, and hosted securely within AWS using S3, CloudFront and Cognito.

A comment and chat system was also developed using serverless technologies. This enabled users to create chat rooms around specific topics and securely share articles and information with their colleagues. All communication within the site is handled securely, with an in-built link-shortening service obfuscating the content.

Delivery

Delivered by Scott Logic

  • Development of a secure Elasticsearch cluster hosted on AWS EC2 instances
  • Implementation of AWS Lambda-based serverless data pipelines
  • UX design and build of a modern Vue.js interface to query and analyse the data